Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2017/12/08 00:57:56 UTC

[GitHub] piiswrong closed pull request #8994: Fix race condition in engine start/stop

piiswrong closed pull request #8994: Fix race condition in engine start/stop
URL: https://github.com/apache/incubator-mxnet/pull/8994

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

diff --git a/CONTRIBUTORS.md b/CONTRIBUTORS.md
index 7847a5e089..64cd29dc0b 100644
--- a/CONTRIBUTORS.md
+++ b/CONTRIBUTORS.md
@@ -7,6 +7,7 @@ Committers
 ----------
 Committers are people who have made substantial contribution to the project and being active.
 The committers are the granted write access to the project.
+A full list of committers can be found here:  http://incubator.apache.org/projects/mxnet.html
 
 * [Bing Xu](https://github.com/antinucleon)
   - Bing is the initiator and major contributor of operators and ndarray modules of mxnet.
@@ -39,6 +40,7 @@ The committers are the granted write access to the project.
   - Zixuan is one of major maintainers of mxnet scala package.
 * [Yuan Tang](https://github.com/terrytangyuan)
   - Yuan is one of major maintainers of mxnet scala package.
+* [Chris Olivier](https://github.com/cjolivier01)
 
 ### Become a Committer
 MXNet is a opensource project and we are actively looking for new committers
diff --git a/LICENSE b/LICENSE
index 01dfcf4679..1a02899fee 100644
--- a/LICENSE
+++ b/LICENSE
@@ -96,6 +96,7 @@
           Derivative Works a copy of this License; and
 
       (b) You must cause any modified files to carry prominent notices
+
           stating that You changed the files; and
 
       (c) You must retain, in the Source form of any Derivative Works
@@ -224,8 +225,10 @@
     7. mshadow - For details, see, mshadow/LICENSE
     8. nnvm/dmlc-core - For details, see, nnvm/dmlc-core/LICENSE
     9. nnvm - For details, see, nnvm/LICENSE
-    10. nnvm-fusion - For details, see, nnvm/plugin/nnvm-fusion/LICENSE
-    11. ps-lite - For details, see, ps-lite/LICENSE
+    10. nnvm/tvm - For details, see, nnvm/tvm/LICENSE
+    11. nnvm/tvm/HalideIR/LICENSE - For details, see,  nnvm/tvm/HalideIR/LICENSE
+    12. nnvm-fusion - For details, see, nnvm/plugin/nnvm-fusion/LICENSE
+    13. ps-lite - For details, see, ps-lite/LICENSE
 
     ========================================================================
     MIT licenses
@@ -235,6 +238,41 @@
     2. Faster R-CNN - For details, see example/rcnn/LICENSE
     3. tree_lstm - For details, see example/gluon/tree_lstm/LICENSE
 
+    ========================================================================
+    JQuery License (MIT license)
+    ========================================================================
+    jQuery JavaScript Library v1.11.1
+    http://jquery.com/
+
+    Includes Sizzle.js
+    http://sizzlejs.com/
+
+    Copyright 2005, 2014 jQuery Foundation, Inc. and other contributors
+    ----
+    Released under the MIT license
+    MIT License
+
+    Permission is hereby granted, free of charge, to any person obtaining a
+    copy of this software and associated documentation files (the "Software"),
+    to deal in the Software without restriction, including without limitation
+    the rights to use, copy, modify, merge, publish, distribute, sublicense,
+    and/or sell copies of the Software, and to permit persons to whom the
+    Software is furnished to do so, subject to the following conditions:
+
+    The above copyright notice and this permission notice shall be included
+    in all copies or substantial portions of the Software.
+
+    THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+    IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+    FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+    THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
+    OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+    ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+    OTHER DEALINGS IN THE SOFTWARE.
+    ----
+    http://jquery.org/license
+
+    Date: 2014-05-01T17:42Z
 
     ========================================================================
     NVIDIA Licenses
@@ -350,10 +388,40 @@
     (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
     LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
     ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
-    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF TH
+E USE OF THIS
     SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 
     The views and conclusions contained in the software and documentation are those
     of the authors and should not be interpreted as representing official policies,
     either expressed or implied, of the FreeBSD Project.
 
+
+    3. Sphinx JavaScript utilties for the full-text search
+
+    For details, see, docs/_static/searchtools_custom.js
+
+    Redistribution and use in source and binary forms, with or without
+    modification, are permitted provided that the following conditions are
+    met:
+
+    * Redistributions of source code must retain the above copyright
+     notice, this list of conditions and the following disclaimer.
+
+    * Redistributions in binary form must reproduce the above copyright
+     notice, this list of conditions and the following disclaimer in the
+     documentation and/or other materials provided with the distribution.
+
+    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+    HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+
diff --git a/Makefile b/Makefile
index 5d5dcdfc79..ceed645043 100644
--- a/Makefile
+++ b/Makefile
@@ -435,7 +435,7 @@ test: $(TEST)
 lint: cpplint rcpplint jnilint pylint
 
 cpplint:
-	python2 dmlc-core/scripts/lint.py mxnet cpp include src plugin cpp-package tests \
+	dmlc-core/scripts/lint.py mxnet cpp include src plugin cpp-package tests \
 	--exclude_path src/operator/contrib/ctc_include
 
 pylint:
@@ -467,7 +467,7 @@ cyclean:
 
 # R related shortcuts
 rcpplint:
-	python2 dmlc-core/scripts/lint.py mxnet-rcpp ${LINT_LANG} R-package/src
+	dmlc-core/scripts/lint.py mxnet-rcpp ${LINT_LANG} R-package/src
 
 rpkg:
 	mkdir -p R-package/inst
@@ -525,7 +525,7 @@ scaladeploy:
 			-Dlddeps="$(LIB_DEP) $(ROOTDIR)/lib/libmxnet.a")
 
 jnilint:
-	python2 dmlc-core/scripts/lint.py mxnet-jnicpp cpp scala-package/native/src
+	dmlc-core/scripts/lint.py mxnet-jnicpp cpp scala-package/native/src
 
 ifneq ($(EXTRA_OPERATORS),)
 clean: cyclean $(EXTRA_PACKAGES_CLEAN)
diff --git a/NEWS.md b/NEWS.md
index 666b5d88e6..fc6b10188f 100644
--- a/NEWS.md
+++ b/NEWS.md
@@ -1,5 +1,70 @@
 MXNet Change Log
 ================
+## 1.0.0
+### Performance
+  - Enhanced the performance of `sparse.dot` operator.
+  - MXNet now automatically set OpenMP to use all available CPU cores to maximize CPU utilization when `NUM_OMP_THREADS` is not set.
+  - Unary and binary operators now avoid using OpenMP on small arrays if using OpenMP actually hurts performance due to multithreading overhead.
+  - Significantly improved performance of `broadcast_add`, `broadcast_mul`, etc on CPU.
+  - Added bulk execution to imperative mode. You can control segment size with `mxnet.engine.bulk`. As a result, the speed of Gluon in hybrid mode is improved, especially on small networks and multiple GPUs.
+  - Improved speed for `ctypes` invocation from Python frontend.
+### New Features - Gradient Compression [Experimental]
+  - Speed up multi-GPU and distributed training by compressing communication of gradients. This is especially effective when training networks with large fully-connected layers. In Gluon this can be activated with `compression_params` in Trainer.
+### New Features - Support of NVIDIA Collective Communication Library (NCCL) [Experimental]
+  - Use `kvstore='nccl'` for (in some cases) faster training on multiple GPUs.
+  - Significantly faster than kvstore='device' when batch size is small.
+  - It is recommended to set environment variable `NCCL_LAUNCH_MODE` to `PARALLEL` when using NCCL version 2.1 or newer.
+### New Features - Advanced Indexing [General Availability]
+  - NDArray now supports advanced indexing (both slice and assign) as specified by the numpy standard: https://docs.scipy.org/doc/numpy-1.13.0/reference/arrays.indexing.html#combining-advanced-and-basic-indexing with the following restrictions:
+    - if key is a list type, only a list of integers is supported, e.g. `key=[1, 2]` is supported, while not for `key=[[1, 2]]`.
+    - Ellipsis (...) and np.newaxis are not supported.
+    - `Boolean` array indexing is not supported.
+### New Features - Gluon [General Availability]
+  - Performance optimizations discussed above.
+  - Added support for loading data in parallel with multiple processes to `gluon.data.DataLoader`. The number of workers can be set with `num_worker`. Does not support windows yet.
+  - Added Block.cast to support networks with different data types, e.g. `float16`.
+  - Added Lambda block for wrapping a user defined function as a block.
+  - Generalized `gluon.data.ArrayDataset` to support arbitrary number of arrays.
+### New Features - ARM / Raspberry Pi support [Experimental]
+  - MXNet now compiles and runs on ARMv6, ARMv7, ARMv64 including Raspberry Pi devices. See https://github.com/apache/incubator-mxnet/tree/master/docker_multiarch for more information.
+### New Features - NVIDIA Jetson support [Experimental]
+  - MXNet now compiles and runs on NVIDIA Jetson TX2 boards with GPU acceleration.
+  - You can install the python MXNet package on a Jetson board by running - `$ pip install mxnet-jetson-tx2`.
+### New Features - Sparse Tensor Support [General Availability]
+  - Added more sparse operators: `contrib.SparseEmbedding`, `sparse.sum` and `sparse.mean`. 
+  - Added `asscipy()` for easier conversion to scipy.
+  - Added `check_format()` for sparse ndarrays to check if the array format is valid.
+### Bug-fixes  
+  - Fixed a[-1] indexing doesn't work on `NDArray`.
+  - Fixed `expand_dims` if axis < 0.
+  - Fixed a bug that causes topk to produce incorrect result on large arrays.
+  - Improved numerical precision of unary and binary operators for `float64` data.
+  - Fixed derivatives of log2 and log10. They used to be the same with log.
+  - Fixed a bug that causes MXNet to hang after fork. Note that you still cannot use GPU in child processes after fork due to limitations of CUDA.
+  - Fixed a bug that causes `CustomOp` to fail when using auxiliary states.
+  - Fixed a security bug that is causing MXNet to listen on all available interfaces when running training in distributed mode.
+### Doc Updates
+  - Added a security best practices document under FAQ section.
+  - Fixed License Headers including restoring copyright attributions.
+  - Documentation updates. 
+  - Links for viewing source.
+ 
+ For more information and examples, see [full release notes](https://cwiki.apache.org/confluence/display/MXNET/Apache+MXNet+%28incubating%29+1.0+Release+Notes)
+
+
+## 0.12.1
+### Bug-fixes
+  - Added GPU support for the `syevd` operator which ensures that there is GPU support for all linalg-operators.
+  - Bugfix for `syevd` on CPU such that it works for `float32`.
+  - Fixed API call when `OMP_NUM_THREADS` environment variable is set. 
+  - Fixed `MakeNonlossGradNode` bug.
+  - Fixed bug related to passing `dtype` to `array()`. 
+  - Fixed some minor bugs for sparse distributed training.
+  - Fixed a bug on `Slice` accessing uninitialized memory in `param.begin` in the file `matrix_op-inl.h`. 
+  - Fixed `gluon.data.RecordFileDataset`.
+  - Fixed a bug that caused `autograd` to crash on some networks.
+  
+  
 ## 0.12.0
 ### Performance
   - Added full support for NVIDIA Volta GPU Architecture and CUDA 9. Training CNNs is up to 3.5x faster than Pascal when using float16 precision.
diff --git a/R-package/DESCRIPTION b/R-package/DESCRIPTION
index 3d57ea876f..6e0f93294b 100644
--- a/R-package/DESCRIPTION
+++ b/R-package/DESCRIPTION
@@ -1,7 +1,7 @@
 Package: mxnet
 Type: Package
 Title: MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems
-Version: 0.12.1
+Version: 1.0.0
 Date: 2017-06-27
 Author: Tianqi Chen, Qiang Kou, Tong He
 Maintainer: Qiang Kou <qk...@qkou.info>
diff --git a/README.md b/README.md
index fc252a7a72..6e7dc41c1e 100644
--- a/README.md
+++ b/README.md
@@ -22,6 +22,8 @@ deep learning systems, and interesting insights of DL systems for hackers.
 
 What's New
 ----------
+* [Version 1.0.0 Release](https://github.com/apache/incubator-mxnet/releases/tag/1.0.0) - MXNet 1.0.0 Release.
+* [Version 0.12.1 Release](https://github.com/apache/incubator-mxnet/releases/tag/0.12.1) - MXNet 0.12.1 Patch Release.
 * [Version 0.12.0 Release](https://github.com/apache/incubator-mxnet/releases/tag/0.12.0) - MXNet 0.12.0 Release.
 * [Version 0.11.0 Release](https://github.com/apache/incubator-mxnet/releases/tag/0.11.0) - MXNet 0.11.0 Release.
 * [Apache Incubator](http://incubator.apache.org/projects/mxnet.html) - We are now an Apache Incubator project.
diff --git a/amalgamation/prep_nnvm.sh b/amalgamation/prep_nnvm.sh
index baf6d4d2d0..60c9674330 100755
--- a/amalgamation/prep_nnvm.sh
+++ b/amalgamation/prep_nnvm.sh
@@ -1,4 +1,20 @@
 #! /bin/bash
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+# 
+#   http://www.apache.org/licenses/LICENSE-2.0
+# 
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
 DMLC_CORE=$(pwd)/../dmlc-core
 cd ../nnvm/amalgamation
 make clean
diff --git a/cmake/Modules/FindJeMalloc.cmake b/cmake/Modules/FindJeMalloc.cmake
index f3ca06faa3..0ab1cec55f 100644
--- a/cmake/Modules/FindJeMalloc.cmake
+++ b/cmake/Modules/FindJeMalloc.cmake
@@ -1,27 +1,3 @@
-# Licensed to the Apache Software Foundation (ASF) under one
-# or more contributor license agreements.  See the NOTICE file
-# distributed with this work for additional information
-# regarding copyright ownership.  The ASF licenses this file
-# to you under the Apache License, Version 2.0 (the
-# "License"); you may not use this file except in compliance
-# with the License.  You may obtain a copy of the License at
-#
-#   http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing,
-# software distributed under the License is distributed on an
-# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-# KIND, either express or implied.  See the License for the
-# specific language governing permissions and limitations
-# under the License.
-
-
-# Copyright (c)      2014 Thomas Heller
-# Copyright (c) 2007-2012 Hartmut Kaiser
-# Copyright (c) 2010-2011 Matt Anderson
-# Copyright (c) 2011      Bryce Lelbach
-#
-#----
 # Distributed under the Boost Software License, Version 1.0.
 # Boost Software License - Version 1.0 - August 17th, 2003
 #
diff --git a/docker/run.sh b/docker/run.sh
old mode 100644
new mode 100755
diff --git a/docs/_static/cn.svg b/docs/_static/cn.svg
index 515176d60f..9fb3fc084c 100644
--- a/docs/_static/cn.svg
+++ b/docs/_static/cn.svg
@@ -1,4 +1,24 @@
 <?xml version="1.0" encoding="UTF-8" standalone="no"?>
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
 <svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="640" height="480" viewBox="-5 -5 12.8 9.6">
   <title>
     Flag of the People&apos;s Republic of China
diff --git a/docs/_static/js/auto_module_index.js b/docs/_static/js/auto_module_index.js
index 7f4e185655..83bdbf3717 100644
--- a/docs/_static/js/auto_module_index.js
+++ b/docs/_static/js/auto_module_index.js
@@ -1,3 +1,22 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
 function auto_index(module) {
   $(document).ready(function () {
     // find all classes or functions
@@ -21,4 +40,4 @@ function auto_index(module) {
     html += "</ul>";
     li_node.append(html);
   });
-}
\ No newline at end of file
+}
diff --git a/docs/_static/js/clipboard.min.js b/docs/_static/js/clipboard.min.js
old mode 100755
new mode 100644
index 1993676f99..a23c4e1384
--- a/docs/_static/js/clipboard.min.js
+++ b/docs/_static/js/clipboard.min.js
@@ -1,7 +1,26 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
 /*!
  * clipboard.js v1.6.1
  * https://zenorocha.github.io/clipboard.js
  *
  * Licensed MIT © Zeno Rocha
  */
-!function(e){if("object"==typeof exports&&"undefined"!=typeof module)module.exports=e();else if("function"==typeof define&&define.amd)define([],e);else{var t;t="undefined"!=typeof window?window:"undefined"!=typeof global?global:"undefined"!=typeof self?self:this,t.Clipboard=e()}}(function(){var e,t,n;return function e(t,n,o){function i(a,c){if(!n[a]){if(!t[a]){var l="function"==typeof require&&require;if(!c&&l)return l(a,!0);if(r)return r(a,!0);var u=new Error("Cannot find module '"+a+"'");throw u.code="MODULE_NOT_FOUND",u}var s=n[a]={exports:{}};t[a][0].call(s.exports,function(e){var n=t[a][1][e];return i(n?n:e)},s,s.exports,e,t,n,o)}return n[a].exports}for(var r="function"==typeof require&&require,a=0;a<o.length;a++)i(o[a]);return i}({1:[function(e,t,n){function o(e,t){for(;e&&e.nodeType!==i;){if(e.matches(t))return e;e=e.parentNode}}var i=9;if("undefined"!=typeof Element&&!Element.prototype.matches){var r=Element.prototype;r.matches=r.matchesSelector||r.mozMatchesSelector||r.msMa
 tchesSelector||r.oMatchesSelector||r.webkitMatchesSelector}t.exports=o},{}],2:[function(e,t,n){function o(e,t,n,o,r){var a=i.apply(this,arguments);return e.addEventListener(n,a,r),{destroy:function(){e.removeEventListener(n,a,r)}}}function i(e,t,n,o){return function(n){n.delegateTarget=r(n.target,t),n.delegateTarget&&o.call(e,n)}}var r=e("./closest");t.exports=o},{"./closest":1}],3:[function(e,t,n){n.node=function(e){return void 0!==e&&e instanceof HTMLElement&&1===e.nodeType},n.nodeList=function(e){var t=Object.prototype.toString.call(e);return void 0!==e&&("[object NodeList]"===t||"[object HTMLCollection]"===t)&&"length"in e&&(0===e.length||n.node(e[0]))},n.string=function(e){return"string"==typeof e||e instanceof String},n.fn=function(e){var t=Object.prototype.toString.call(e);return"[object Function]"===t}},{}],4:[function(e,t,n){function o(e,t,n){if(!e&&!t&&!n)throw new Error("Missing required arguments");if(!c.string(t))throw new TypeError("Second argument must be a String");i
 f(!c.fn(n))throw new TypeError("Third argument must be a Function");if(c.node(e))return i(e,t,n);if(c.nodeList(e))return r(e,t,n);if(c.string(e))return a(e,t,n);throw new TypeError("First argument must be a String, HTMLElement, HTMLCollection, or NodeList")}function i(e,t,n){return e.addEventListener(t,n),{destroy:function(){e.removeEventListener(t,n)}}}function r(e,t,n){return Array.prototype.forEach.call(e,function(e){e.addEventListener(t,n)}),{destroy:function(){Array.prototype.forEach.call(e,function(e){e.removeEventListener(t,n)})}}}function a(e,t,n){return l(document.body,e,t,n)}var c=e("./is"),l=e("delegate");t.exports=o},{"./is":3,delegate:2}],5:[function(e,t,n){function o(e){var t;if("SELECT"===e.nodeName)e.focus(),t=e.value;else if("INPUT"===e.nodeName||"TEXTAREA"===e.nodeName){var n=e.hasAttribute("readonly");n||e.setAttribute("readonly",""),e.select(),e.setSelectionRange(0,e.value.length),n||e.removeAttribute("readonly"),t=e.value}else{e.hasAttribute("contenteditable")&&
 e.focus();var o=window.getSelection(),i=document.createRange();i.selectNodeContents(e),o.removeAllRanges(),o.addRange(i),t=o.toString()}return t}t.exports=o},{}],6:[function(e,t,n){function o(){}o.prototype={on:function(e,t,n){var o=this.e||(this.e={});return(o[e]||(o[e]=[])).push({fn:t,ctx:n}),this},once:function(e,t,n){function o(){i.off(e,o),t.apply(n,arguments)}var i=this;return o._=t,this.on(e,o,n)},emit:function(e){var t=[].slice.call(arguments,1),n=((this.e||(this.e={}))[e]||[]).slice(),o=0,i=n.length;for(o;o<i;o++)n[o].fn.apply(n[o].ctx,t);return this},off:function(e,t){var n=this.e||(this.e={}),o=n[e],i=[];if(o&&t)for(var r=0,a=o.length;r<a;r++)o[r].fn!==t&&o[r].fn._!==t&&i.push(o[r]);return i.length?n[e]=i:delete n[e],this}},t.exports=o},{}],7:[function(t,n,o){!function(i,r){if("function"==typeof e&&e.amd)e(["module","select"],r);else if("undefined"!=typeof o)r(n,t("select"));else{var a={exports:{}};r(a,i.select),i.clipboardAction=a.exports}}(this,function(e,t){"use strict
 ";function n(e){return e&&e.__esModule?e:{default:e}}function o(e,t){if(!(e instanceof t))throw new TypeError("Cannot call a class as a function")}var i=n(t),r="function"==typeof Symbol&&"symbol"==typeof Symbol.iterator?function(e){return typeof e}:function(e){return e&&"function"==typeof Symbol&&e.constructor===Symbol&&e!==Symbol.prototype?"symbol":typeof e},a=function(){function e(e,t){for(var n=0;n<t.length;n++){var o=t[n];o.enumerable=o.enumerable||!1,o.configurable=!0,"value"in o&&(o.writable=!0),Object.defineProperty(e,o.key,o)}}return function(t,n,o){return n&&e(t.prototype,n),o&&e(t,o),t}}(),c=function(){function e(t){o(this,e),this.resolveOptions(t),this.initSelection()}return a(e,[{key:"resolveOptions",value:function e(){var t=arguments.length>0&&void 0!==arguments[0]?arguments[0]:{};this.action=t.action,this.emitter=t.emitter,this.target=t.target,this.text=t.text,this.trigger=t.trigger,this.selectedText=""}},{key:"initSelection",value:function e(){this.text?this.selectFak
 e():this.target&&this.selectTarget()}},{key:"selectFake",value:function e(){var t=this,n="rtl"==document.documentElement.getAttribute("dir");this.removeFake(),this.fakeHandlerCallback=function(){return t.removeFake()},this.fakeHandler=document.body.addEventListener("click",this.fakeHandlerCallback)||!0,this.fakeElem=document.createElement("textarea"),this.fakeElem.style.fontSize="12pt",this.fakeElem.style.border="0",this.fakeElem.style.padding="0",this.fakeElem.style.margin="0",this.fakeElem.style.position="absolute",this.fakeElem.style[n?"right":"left"]="-9999px";var o=window.pageYOffset||document.documentElement.scrollTop;this.fakeElem.style.top=o+"px",this.fakeElem.setAttribute("readonly",""),this.fakeElem.value=this.text,document.body.appendChild(this.fakeElem),this.selectedText=(0,i.default)(this.fakeElem),this.copyText()}},{key:"removeFake",value:function e(){this.fakeHandler&&(document.body.removeEventListener("click",this.fakeHandlerCallback),this.fakeHandler=null,this.fakeH
 andlerCallback=null),this.fakeElem&&(document.body.removeChild(this.fakeElem),this.fakeElem=null)}},{key:"selectTarget",value:function e(){this.selectedText=(0,i.default)(this.target),this.copyText()}},{key:"copyText",value:function e(){var t=void 0;try{t=document.execCommand(this.action)}catch(e){t=!1}this.handleResult(t)}},{key:"handleResult",value:function e(t){this.emitter.emit(t?"success":"error",{action:this.action,text:this.selectedText,trigger:this.trigger,clearSelection:this.clearSelection.bind(this)})}},{key:"clearSelection",value:function e(){this.target&&this.target.blur(),window.getSelection().removeAllRanges()}},{key:"destroy",value:function e(){this.removeFake()}},{key:"action",set:function e(){var t=arguments.length>0&&void 0!==arguments[0]?arguments[0]:"copy";if(this._action=t,"copy"!==this._action&&"cut"!==this._action)throw new Error('Invalid "action" value, use either "copy" or "cut"')},get:function e(){return this._action}},{key:"target",set:function e(t){if(voi
 d 0!==t){if(!t||"object"!==("undefined"==typeof t?"undefined":r(t))||1!==t.nodeType)throw new Error('Invalid "target" value, use a valid Element');if("copy"===this.action&&t.hasAttribute("disabled"))throw new Error('Invalid "target" attribute. Please use "readonly" instead of "disabled" attribute');if("cut"===this.action&&(t.hasAttribute("readonly")||t.hasAttribute("disabled")))throw new Error('Invalid "target" attribute. You can\'t cut text from elements with "readonly" or "disabled" attributes');this._target=t}},get:function e(){return this._target}}]),e}();e.exports=c})},{select:5}],8:[function(t,n,o){!function(i,r){if("function"==typeof e&&e.amd)e(["module","./clipboard-action","tiny-emitter","good-listener"],r);else if("undefined"!=typeof o)r(n,t("./clipboard-action"),t("tiny-emitter"),t("good-listener"));else{var a={exports:{}};r(a,i.clipboardAction,i.tinyEmitter,i.goodListener),i.clipboard=a.exports}}(this,function(e,t,n,o){"use strict";function i(e){return e&&e.__esModule?e:
 {default:e}}function r(e,t){if(!(e instanceof t))throw new TypeError("Cannot call a class as a function")}function a(e,t){if(!e)throw new ReferenceError("this hasn't been initialised - super() hasn't been called");return!t||"object"!=typeof t&&"function"!=typeof t?e:t}function c(e,t){if("function"!=typeof t&&null!==t)throw new TypeError("Super expression must either be null or a function, not "+typeof t);e.prototype=Object.create(t&&t.prototype,{constructor:{value:e,enumerable:!1,writable:!0,configurable:!0}}),t&&(Object.setPrototypeOf?Object.setPrototypeOf(e,t):e.__proto__=t)}function l(e,t){var n="data-clipboard-"+e;if(t.hasAttribute(n))return t.getAttribute(n)}var u=i(t),s=i(n),f=i(o),d=function(){function e(e,t){for(var n=0;n<t.length;n++){var o=t[n];o.enumerable=o.enumerable||!1,o.configurable=!0,"value"in o&&(o.writable=!0),Object.defineProperty(e,o.key,o)}}return function(t,n,o){return n&&e(t.prototype,n),o&&e(t,o),t}}(),h=function(e){function t(e,n){r(this,t);var o=a(this,(t
 .__proto__||Object.getPrototypeOf(t)).call(this));return o.resolveOptions(n),o.listenClick(e),o}return c(t,e),d(t,[{key:"resolveOptions",value:function e(){var t=arguments.length>0&&void 0!==arguments[0]?arguments[0]:{};this.action="function"==typeof t.action?t.action:this.defaultAction,this.target="function"==typeof t.target?t.target:this.defaultTarget,this.text="function"==typeof t.text?t.text:this.defaultText}},{key:"listenClick",value:function e(t){var n=this;this.listener=(0,f.default)(t,"click",function(e){return n.onClick(e)})}},{key:"onClick",value:function e(t){var n=t.delegateTarget||t.currentTarget;this.clipboardAction&&(this.clipboardAction=null),this.clipboardAction=new u.default({action:this.action(n),target:this.target(n),text:this.text(n),trigger:n,emitter:this})}},{key:"defaultAction",value:function e(t){return l("action",t)}},{key:"defaultTarget",value:function e(t){var n=l("target",t);if(n)return document.querySelector(n)}},{key:"defaultText",value:function e(t){r
 eturn l("text",t)}},{key:"destroy",value:function e(){this.listener.destroy(),this.clipboardAction&&(this.clipboardAction.destroy(),this.clipboardAction=null)}}],[{key:"isSupported",value:function e(){var t=arguments.length>0&&void 0!==arguments[0]?arguments[0]:["copy","cut"],n="string"==typeof t?[t]:t,o=!!document.queryCommandSupported;return n.forEach(function(e){o=o&&!!document.queryCommandSupported(e)}),o}}]),t}(s.default);e.exports=h})},{"./clipboard-action":7,"good-listener":4,"tiny-emitter":6}]},{},[8])(8)});
\ No newline at end of file
+!function(e){if("object"==typeof exports&&"undefined"!=typeof module)module.exports=e();else if("function"==typeof define&&define.amd)define([],e);else{var t;t="undefined"!=typeof window?window:"undefined"!=typeof global?global:"undefined"!=typeof self?self:this,t.Clipboard=e()}}(function(){var e,t,n;return function e(t,n,o){function i(a,c){if(!n[a]){if(!t[a]){var l="function"==typeof require&&require;if(!c&&l)return l(a,!0);if(r)return r(a,!0);var u=new Error("Cannot find module '"+a+"'");throw u.code="MODULE_NOT_FOUND",u}var s=n[a]={exports:{}};t[a][0].call(s.exports,function(e){var n=t[a][1][e];return i(n?n:e)},s,s.exports,e,t,n,o)}return n[a].exports}for(var r="function"==typeof require&&require,a=0;a<o.length;a++)i(o[a]);return i}({1:[function(e,t,n){function o(e,t){for(;e&&e.nodeType!==i;){if(e.matches(t))return e;e=e.parentNode}}var i=9;if("undefined"!=typeof Element&&!Element.prototype.matches){var r=Element.prototype;r.matches=r.matchesSelector||r.mozMatchesSelector||r.msMa
 tchesSelector||r.oMatchesSelector||r.webkitMatchesSelector}t.exports=o},{}],2:[function(e,t,n){function o(e,t,n,o,r){var a=i.apply(this,arguments);return e.addEventListener(n,a,r),{destroy:function(){e.removeEventListener(n,a,r)}}}function i(e,t,n,o){return function(n){n.delegateTarget=r(n.target,t),n.delegateTarget&&o.call(e,n)}}var r=e("./closest");t.exports=o},{"./closest":1}],3:[function(e,t,n){n.node=function(e){return void 0!==e&&e instanceof HTMLElement&&1===e.nodeType},n.nodeList=function(e){var t=Object.prototype.toString.call(e);return void 0!==e&&("[object NodeList]"===t||"[object HTMLCollection]"===t)&&"length"in e&&(0===e.length||n.node(e[0]))},n.string=function(e){return"string"==typeof e||e instanceof String},n.fn=function(e){var t=Object.prototype.toString.call(e);return"[object Function]"===t}},{}],4:[function(e,t,n){function o(e,t,n){if(!e&&!t&&!n)throw new Error("Missing required arguments");if(!c.string(t))throw new TypeError("Second argument must be a String");i
 f(!c.fn(n))throw new TypeError("Third argument must be a Function");if(c.node(e))return i(e,t,n);if(c.nodeList(e))return r(e,t,n);if(c.string(e))return a(e,t,n);throw new TypeError("First argument must be a String, HTMLElement, HTMLCollection, or NodeList")}function i(e,t,n){return e.addEventListener(t,n),{destroy:function(){e.removeEventListener(t,n)}}}function r(e,t,n){return Array.prototype.forEach.call(e,function(e){e.addEventListener(t,n)}),{destroy:function(){Array.prototype.forEach.call(e,function(e){e.removeEventListener(t,n)})}}}function a(e,t,n){return l(document.body,e,t,n)}var c=e("./is"),l=e("delegate");t.exports=o},{"./is":3,delegate:2}],5:[function(e,t,n){function o(e){var t;if("SELECT"===e.nodeName)e.focus(),t=e.value;else if("INPUT"===e.nodeName||"TEXTAREA"===e.nodeName){var n=e.hasAttribute("readonly");n||e.setAttribute("readonly",""),e.select(),e.setSelectionRange(0,e.value.length),n||e.removeAttribute("readonly"),t=e.value}else{e.hasAttribute("contenteditable")&&
 e.focus();var o=window.getSelection(),i=document.createRange();i.selectNodeContents(e),o.removeAllRanges(),o.addRange(i),t=o.toString()}return t}t.exports=o},{}],6:[function(e,t,n){function o(){}o.prototype={on:function(e,t,n){var o=this.e||(this.e={});return(o[e]||(o[e]=[])).push({fn:t,ctx:n}),this},once:function(e,t,n){function o(){i.off(e,o),t.apply(n,arguments)}var i=this;return o._=t,this.on(e,o,n)},emit:function(e){var t=[].slice.call(arguments,1),n=((this.e||(this.e={}))[e]||[]).slice(),o=0,i=n.length;for(o;o<i;o++)n[o].fn.apply(n[o].ctx,t);return this},off:function(e,t){var n=this.e||(this.e={}),o=n[e],i=[];if(o&&t)for(var r=0,a=o.length;r<a;r++)o[r].fn!==t&&o[r].fn._!==t&&i.push(o[r]);return i.length?n[e]=i:delete n[e],this}},t.exports=o},{}],7:[function(t,n,o){!function(i,r){if("function"==typeof e&&e.amd)e(["module","select"],r);else if("undefined"!=typeof o)r(n,t("select"));else{var a={exports:{}};r(a,i.select),i.clipboardAction=a.exports}}(this,function(e,t){"use strict
 ";function n(e){return e&&e.__esModule?e:{default:e}}function o(e,t){if(!(e instanceof t))throw new TypeError("Cannot call a class as a function")}var i=n(t),r="function"==typeof Symbol&&"symbol"==typeof Symbol.iterator?function(e){return typeof e}:function(e){return e&&"function"==typeof Symbol&&e.constructor===Symbol&&e!==Symbol.prototype?"symbol":typeof e},a=function(){function e(e,t){for(var n=0;n<t.length;n++){var o=t[n];o.enumerable=o.enumerable||!1,o.configurable=!0,"value"in o&&(o.writable=!0),Object.defineProperty(e,o.key,o)}}return function(t,n,o){return n&&e(t.prototype,n),o&&e(t,o),t}}(),c=function(){function e(t){o(this,e),this.resolveOptions(t),this.initSelection()}return a(e,[{key:"resolveOptions",value:function e(){var t=arguments.length>0&&void 0!==arguments[0]?arguments[0]:{};this.action=t.action,this.emitter=t.emitter,this.target=t.target,this.text=t.text,this.trigger=t.trigger,this.selectedText=""}},{key:"initSelection",value:function e(){this.text?this.selectFak
 e():this.target&&this.selectTarget()}},{key:"selectFake",value:function e(){var t=this,n="rtl"==document.documentElement.getAttribute("dir");this.removeFake(),this.fakeHandlerCallback=function(){return t.removeFake()},this.fakeHandler=document.body.addEventListener("click",this.fakeHandlerCallback)||!0,this.fakeElem=document.createElement("textarea"),this.fakeElem.style.fontSize="12pt",this.fakeElem.style.border="0",this.fakeElem.style.padding="0",this.fakeElem.style.margin="0",this.fakeElem.style.position="absolute",this.fakeElem.style[n?"right":"left"]="-9999px";var o=window.pageYOffset||document.documentElement.scrollTop;this.fakeElem.style.top=o+"px",this.fakeElem.setAttribute("readonly",""),this.fakeElem.value=this.text,document.body.appendChild(this.fakeElem),this.selectedText=(0,i.default)(this.fakeElem),this.copyText()}},{key:"removeFake",value:function e(){this.fakeHandler&&(document.body.removeEventListener("click",this.fakeHandlerCallback),this.fakeHandler=null,this.fakeH
 andlerCallback=null),this.fakeElem&&(document.body.removeChild(this.fakeElem),this.fakeElem=null)}},{key:"selectTarget",value:function e(){this.selectedText=(0,i.default)(this.target),this.copyText()}},{key:"copyText",value:function e(){var t=void 0;try{t=document.execCommand(this.action)}catch(e){t=!1}this.handleResult(t)}},{key:"handleResult",value:function e(t){this.emitter.emit(t?"success":"error",{action:this.action,text:this.selectedText,trigger:this.trigger,clearSelection:this.clearSelection.bind(this)})}},{key:"clearSelection",value:function e(){this.target&&this.target.blur(),window.getSelection().removeAllRanges()}},{key:"destroy",value:function e(){this.removeFake()}},{key:"action",set:function e(){var t=arguments.length>0&&void 0!==arguments[0]?arguments[0]:"copy";if(this._action=t,"copy"!==this._action&&"cut"!==this._action)throw new Error('Invalid "action" value, use either "copy" or "cut"')},get:function e(){return this._action}},{key:"target",set:function e(t){if(voi
 d 0!==t){if(!t||"object"!==("undefined"==typeof t?"undefined":r(t))||1!==t.nodeType)throw new Error('Invalid "target" value, use a valid Element');if("copy"===this.action&&t.hasAttribute("disabled"))throw new Error('Invalid "target" attribute. Please use "readonly" instead of "disabled" attribute');if("cut"===this.action&&(t.hasAttribute("readonly")||t.hasAttribute("disabled")))throw new Error('Invalid "target" attribute. You can\'t cut text from elements with "readonly" or "disabled" attributes');this._target=t}},get:function e(){return this._target}}]),e}();e.exports=c})},{select:5}],8:[function(t,n,o){!function(i,r){if("function"==typeof e&&e.amd)e(["module","./clipboard-action","tiny-emitter","good-listener"],r);else if("undefined"!=typeof o)r(n,t("./clipboard-action"),t("tiny-emitter"),t("good-listener"));else{var a={exports:{}};r(a,i.clipboardAction,i.tinyEmitter,i.goodListener),i.clipboard=a.exports}}(this,function(e,t,n,o){"use strict";function i(e){return e&&e.__esModule?e:
 {default:e}}function r(e,t){if(!(e instanceof t))throw new TypeError("Cannot call a class as a function")}function a(e,t){if(!e)throw new ReferenceError("this hasn't been initialised - super() hasn't been called");return!t||"object"!=typeof t&&"function"!=typeof t?e:t}function c(e,t){if("function"!=typeof t&&null!==t)throw new TypeError("Super expression must either be null or a function, not "+typeof t);e.prototype=Object.create(t&&t.prototype,{constructor:{value:e,enumerable:!1,writable:!0,configurable:!0}}),t&&(Object.setPrototypeOf?Object.setPrototypeOf(e,t):e.__proto__=t)}function l(e,t){var n="data-clipboard-"+e;if(t.hasAttribute(n))return t.getAttribute(n)}var u=i(t),s=i(n),f=i(o),d=function(){function e(e,t){for(var n=0;n<t.length;n++){var o=t[n];o.enumerable=o.enumerable||!1,o.configurable=!0,"value"in o&&(o.writable=!0),Object.defineProperty(e,o.key,o)}}return function(t,n,o){return n&&e(t.prototype,n),o&&e(t,o),t}}(),h=function(e){function t(e,n){r(this,t);var o=a(this,(t
 .__proto__||Object.getPrototypeOf(t)).call(this));return o.resolveOptions(n),o.listenClick(e),o}return c(t,e),d(t,[{key:"resolveOptions",value:function e(){var t=arguments.length>0&&void 0!==arguments[0]?arguments[0]:{};this.action="function"==typeof t.action?t.action:this.defaultAction,this.target="function"==typeof t.target?t.target:this.defaultTarget,this.text="function"==typeof t.text?t.text:this.defaultText}},{key:"listenClick",value:function e(t){var n=this;this.listener=(0,f.default)(t,"click",function(e){return n.onClick(e)})}},{key:"onClick",value:function e(t){var n=t.delegateTarget||t.currentTarget;this.clipboardAction&&(this.clipboardAction=null),this.clipboardAction=new u.default({action:this.action(n),target:this.target(n),text:this.text(n),trigger:n,emitter:this})}},{key:"defaultAction",value:function e(t){return l("action",t)}},{key:"defaultTarget",value:function e(t){var n=l("target",t);if(n)return document.querySelector(n)}},{key:"defaultText",value:function e(t){r
 eturn l("text",t)}},{key:"destroy",value:function e(){this.listener.destroy(),this.clipboardAction&&(this.clipboardAction.destroy(),this.clipboardAction=null)}}],[{key:"isSupported",value:function e(){var t=arguments.length>0&&void 0!==arguments[0]?arguments[0]:["copy","cut"],n="string"==typeof t?[t]:t,o=!!document.queryCommandSupported;return n.forEach(function(e){o=o&&!!document.queryCommandSupported(e)}),o}}]),t}(s.default);e.exports=h})},{"./clipboard-action":7,"good-listener":4,"tiny-emitter":6}]},{},[8])(8)});
diff --git a/docs/_static/js/copycode.js b/docs/_static/js/copycode.js
index f9ebd64abb..b1c268cfec 100644
--- a/docs/_static/js/copycode.js
+++ b/docs/_static/js/copycode.js
@@ -1,3 +1,22 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
 /*Copy code to clipboard*/
 LANG_GP = {'default':'>>> ', 'python':'>>> ' , 'scala':'scala>', 'julia':'julia> ', 'r':'> ', 'perl':'pdl>' , 'cpp':'', 'bash':'$ '};
 
diff --git a/docs/_static/js/navbar.js b/docs/_static/js/navbar.js
index ee011bd598..e3601c409e 100644
--- a/docs/_static/js/navbar.js
+++ b/docs/_static/js/navbar.js
@@ -1,3 +1,22 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
 var searchBox = $("#search-input-wrap");
 var TITLE = ['/get_started/', '/tutorials/', '/gluon/' , '/api/', '/community/contribute.html', ];
 var DOC_TITLE = ['/faq/', '/architecture/', '/model_zoo/'];
@@ -87,4 +106,4 @@ $(document).ready(function () {
         if($("body").prop("clientWidth") < 1000 || $('div.sphinxsidebar').css('visibility') == 'hidden') $('div.content').css('width', '100%');
         else $('div.content').css('width', 'calc(100% - 300px)');
     });
-});
\ No newline at end of file
+});
diff --git a/docs/_static/js/options.js b/docs/_static/js/options.js
index 77ef94074c..6e285df886 100644
--- a/docs/_static/js/options.js
+++ b/docs/_static/js/options.js
@@ -1,3 +1,22 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
 $(document).ready(function () {
     function label(lbl) {
         return lbl.replace(/[ .]/g, '-').toLowerCase();
diff --git a/docs/_static/js/page.js b/docs/_static/js/page.js
index 24fa2159a1..9054bf49ca 100644
--- a/docs/_static/js/page.js
+++ b/docs/_static/js/page.js
@@ -1,3 +1,22 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
 /* Generate url tracking for each page */
 var protocol = location.protocol.concat("//");
 var host = protocol.concat(window.location.host);
@@ -67,4 +86,4 @@ if ($('div.download-btn').length > 0) {
 var footerHeight = 252;
 if ($('div.content-block').height() > $(window).height() - footerHeight) {
     $('div.footer').css('position', 'relative');
-}
\ No newline at end of file
+}
diff --git a/docs/_static/js/search.js b/docs/_static/js/search.js
index 9df9702225..e9c6e84410 100644
--- a/docs/_static/js/search.js
+++ b/docs/_static/js/search.js
@@ -1,3 +1,22 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
 $(document).ready(function () {
     var searchForm = $("#search-input-wrap").children("form").first();
     searchForm.append('<div class="form-group searchBtn"><input type="submit" class="form-control" value="Go"></div>');
@@ -16,4 +35,4 @@ $(document).ready(function () {
             $('#searchIcon span').addClass('glyphicon-search');
         }
     });
-});
\ No newline at end of file
+});
diff --git a/docs/_static/js/sidebar.js b/docs/_static/js/sidebar.js
index 31e1450154..890f8c36ad 100644
--- a/docs/_static/js/sidebar.js
+++ b/docs/_static/js/sidebar.js
@@ -1,3 +1,22 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
 /*Preprocess*/
 var LANG = ['python', 'scala', 'r', 'julia', 'c++', 'perl'];
 var TITLE_WITH_LANG = ['/get_started/', '/tutorials/', '/faq/', '/architecture/', '/community/'];
@@ -251,4 +270,4 @@ $(document).ready(function () {
         if ($('div.sphinxsidebar').css('visibility') == 'hidden') $('.content').css('width', '100%');
         return;
     }
-});
\ No newline at end of file
+});
diff --git a/docs/_static/mxnet-theme/footer.html b/docs/_static/mxnet-theme/footer.html
index c5ac70e452..76d694e8b3 100644
--- a/docs/_static/mxnet-theme/footer.html
+++ b/docs/_static/mxnet-theme/footer.html
@@ -1,3 +1,22 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
 <div class="footer">
 <div class="section-disclaimer">
 <div class="container">
@@ -13,4 +32,4 @@
     </div>
 </div>
 </div>
-</div>
\ No newline at end of file
+</div>
diff --git a/docs/_static/mxnet-theme/index.html b/docs/_static/mxnet-theme/index.html
index da5cea6f95..40bd6dff5e 100644
--- a/docs/_static/mxnet-theme/index.html
+++ b/docs/_static/mxnet-theme/index.html
@@ -1,3 +1,22 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
 <div id="splash">
   <div class="container">
     <div class="row">
diff --git a/docs/_static/mxnet-theme/layout.html b/docs/_static/mxnet-theme/layout.html
index b776117e79..3d5df27077 100644
--- a/docs/_static/mxnet-theme/layout.html
+++ b/docs/_static/mxnet-theme/layout.html
@@ -1,3 +1,22 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
 {%- block doctype -%}
 <!DOCTYPE html>
 {%- endblock %}
diff --git a/docs/_static/mxnet-theme/navbar.html b/docs/_static/mxnet-theme/navbar.html
index e5619f17fc..fc483f83c9 100644
--- a/docs/_static/mxnet-theme/navbar.html
+++ b/docs/_static/mxnet-theme/navbar.html
@@ -1,3 +1,22 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
 <div class="navbar navbar-fixed-top">
   <div class="container" id="navContainer">
     <div id="header-inner" class="innder">
diff --git a/docs/_static/mxnet.css b/docs/_static/mxnet.css
index 1f97f9e8ae..fec4e45539 100644
--- a/docs/_static/mxnet.css
+++ b/docs/_static/mxnet.css
@@ -1,3 +1,22 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
 /*-------------------- AmazonEmber font -----------------------------------*/
 @font-face {
     font-family: AmazonEmber;
diff --git a/docs/_static/searchtools_custom.js b/docs/_static/searchtools_custom.js
index 42c4493995..fe1e621011 100644
--- a/docs/_static/searchtools_custom.js
+++ b/docs/_static/searchtools_custom.js
@@ -8,14 +8,14 @@
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions are
  * met:
- * 
+ *
  * * Redistributions of source code must retain the above copyright
  *   notice, this list of conditions and the following disclaimer.
- * 
+ *
  * * Redistributions in binary form must reproduce the above copyright
  *   notice, this list of conditions and the following disclaimer in the
  *   documentation and/or other materials provided with the distribution.
- * 
+ *
  * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
  * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
  * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
diff --git a/docs/_static/selectlang.js b/docs/_static/selectlang.js
index 25337abcb2..86fbd10822 100644
--- a/docs/_static/selectlang.js
+++ b/docs/_static/selectlang.js
@@ -1,3 +1,22 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
 function changeLanguage(langSelect, langSelectLabel, rootpath){
 	langSelect.change(function() {
 		var lang = langSelect.val();
@@ -22,4 +41,4 @@ $(document).ready(function () {
 	langSelectLabel.text($("option:selected").text());
 
 	changeLanguage(langSelect, langSelectLabel, getRootPath());
-})
\ No newline at end of file
+})
diff --git a/docs/_static/us.svg b/docs/_static/us.svg
index 1d621f96d8..f410544e3e 100644
--- a/docs/_static/us.svg
+++ b/docs/_static/us.svg
@@ -1,4 +1,24 @@
 <?xml version="1.0" encoding="UTF-8" standalone="no"?>
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
 <svg id="svg153" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.w3.org/2000/svg" height="480" width="640" version="1.1" xmlns:cc="http://creativecommons.org/ns#" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <metadata id="metadata3151">
   <rdf:RDF>
diff --git a/docs/build_version_doc/build_all_version.sh b/docs/build_version_doc/build_all_version.sh
index 2d33bd72c4..bf02a62a15 100755
--- a/docs/build_version_doc/build_all_version.sh
+++ b/docs/build_version_doc/build_all_version.sh
@@ -21,7 +21,7 @@
 # Built files are stored in $built
 # Version numbers are stored in $tag_list.
 # Version numbers are ordered from latest to old and final one is master.
-tag_list="0.12.0 0.11.0 master"
+tag_list="1.0.0 0.12.0 0.11.0 master"
 
 mxnet_url="https://github.com/apache/incubator-mxnet.git"
 mxnet_folder="apache_mxnet"
diff --git a/docs/faq/gradient_compression.md b/docs/faq/gradient_compression.md
new file mode 100644
index 0000000000..4cd58f05d5
--- /dev/null
+++ b/docs/faq/gradient_compression.md
@@ -0,0 +1,107 @@
+# Gradient Compression
+
+Gradient Compression reduces communication bandwidth, and in some scenarios, it can make training more scalable and efficient without significant loss in convergence rate or accuracy. Example implementations with GPUs, CPUs, and distributed training are provided in this document. 
+
+
+## Benefits
+
+**Increased Speed**
+
+For architectures with fully connected layers, gradient compression has been observed to speed up training by about 2x, depending on the size of the model and the network bandwidth of the instance. Bigger models see a larger speedup with gradient compression.
+
+**Minimal Accuracy Loss**
+
+Gradient compression delays the synchronization of small weight updates. Although small weight updates might not be sent for that batch, this information is not discarded. Once the weight updates for this location accumulate to become a larger value, they will be propagated. Since there is no information loss, but only delayed updates, it does not lead to a significant loss in accuracy or convergence rate. In distributed training experiments[1], the accuracy loss observed due to gradient compression was as low as 1%.
+
+
+## When to Use Gradient Compression
+
+When training models whose architectures include large fully connected components, it can be helpful to use gradient compression. For larger models, as well as recurrent neural networks, the communication cost becomes a major factor. Such models stand to benefit greatly from gradient compression.
+
+
+### GPU versus CPU
+
+The greatest benefits from gradient compression are realized when using multi-node (single or multi-GPU) distributed training. Training on CPU would provide a lower compute density per compute node as compared to the massive compute density per compute node on a GPU. Due to this, the required communication bandwidth for CPU-based nodes during training is not as high as for GPU-based nodes. Hence, the benefits of gradient compression are lower for CPU-based nodes as compared to GPU-based nodes.
+
+
+### Network Latency
+
+Benefits of gradient compression can be found when using distributed training with network connected nodes. High network latency between nodes and a large model size can both slow training, and in such cases gradient compression may provide speed improvements.
+
+You may not want to use gradient compression if you have low latency network communication.
+
+
+### Model Size
+
+Distributed training involves synchronization of weights after each batch. Larger models have much higher communication costs during training, hence such models stand to benefit much more from gradient compression.
+When running distributed training with gradient compression, the quantize and dequantize operations happen on CPU parallelized with OpenMP. For smaller models, when training on GPUs, it helps to set `OMP_NUM_THREADS=1` on each node, so that the overhead of launching OMP threads doesn't cause the compression and decompression to be slow.
+
+### Model Architecture
+
+The communication bandwidth requirements during training vary across various neural network architectures and hence the benefits of gradient compression vary accordingly.
+
+In networks which have significant fully connected components, since such layers have low compute cost on GPUs, communication becomes a bottleneck limiting the speed of distributed training. Gradient compression can help reduce the communication cost, and thus speed up training in such cases. We have observed speedup of about 2x on large fully connected neural networks. Models like AlexNet and VGG have large fully connected components as part of the network, hence stand to benefit from gradient compression. As with these models, Long Short-Term Memory architectures require more communication bandwidth, so they also exhibit speed improvements with gradient compression.
+
+Architectures like Convolutional Neural Networks on the other hand have a higher compute cost, in which case some communication can be parallelized with computation. Since communication is not the bottleneck in such networks, gradient compression doesn't help much.
+
+
+### Single Node Gradient Compression
+
+When the training is configured to use device to device communication on a single node with multiple GPUs, gradient compression can be used to reduce the cost of communication. This can provide about 20% speedup for large models using older generation architectures. However, speed benefits may be negligible on a machine with a newer generation architecture where GPUs can communicate at low latency.
+
+
+## Approach
+
+The idea behind gradient compression comes from two observations:
+
+First, when training large neural networks, the gradients of weights computed for a small mini-batch of training data are typically sparse. Only a small fraction of the weights have significant updates after each mini-batch. The synchronization of updates that are near zero can be safely delayed longer than the typical mini-batch size. This essentially means that the rate of weight-update can vary depending on the value of an individual weight.
+
+Secondly, gradients can be compressed significantly by considering only those gradient elements whose absolute values exceed a threshold, and then quantizing them to use lower bits per gradient value. By compressing the gradients, we can reduce communication bandwidth. The delayed gradient values, in the form of quantization error and values that don't meet the threshold, are aggregated into a gradient residual which is communicated when it reaches the threshold.
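The residual mechanism described above can be sketched in a few lines of plain Python. This is an illustration only and is not part of the patch: the helper name `compress_step` is hypothetical, and the sketch assumes each transmitted element is exactly `+threshold`, `-threshold`, or `0`, which the text does not state explicitly.

```python
def compress_step(grad, residual, threshold):
    """One mini-batch of threshold compression with a residual.

    Hypothetical helper. Assumes a sent element is +threshold,
    -threshold, or 0; whatever is not sent stays in the residual.
    """
    sent = []
    new_residual = []
    for g, r in zip(grad, residual):
        acc = g + r  # add the delayed part from earlier batches
        if acc >= threshold:
            q = threshold
        elif acc <= -threshold:
            q = -threshold
        else:
            q = 0.0
        sent.append(q)
        new_residual.append(acc - q)  # quantization error is carried forward
    return sent, new_residual


# A gradient of 0.25 is below the threshold of 0.5 on its own,
# so it is only transmitted once two batches have accumulated.
residual = [0.0]
history = []
for _ in range(4):
    sent, residual = compress_step([0.25], residual, threshold=0.5)
    history.append(sent[0])
```

In this run the small update is delayed rather than dropped: every second batch sends the accumulated value and the residual returns to zero.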
+
+## Technical Implementation
+
+### Two Bit Quantization
+
+Currently, the only supported type of quantization uses two bits for each gradient value. Any positive value greater than or equal to the threshold sets the two bits to `11`, any negative value whose absolute value is greater than or equal to the threshold sets the two bits to `10`, and all other values are set to `00`. This enables us to store 16 quantized gradients in one 32-bit float. The quantization error, which is `original_value - quantized_value`, is stored in the form of a gradient residual.
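
The quantization rule above can be sketched in plain Python. This is a minimal illustration, not MXNet's implementation: the function names (`quantize_2bit`, `dequantize_2bit`, `pack_codes`) are hypothetical, and two details are assumptions rather than statements from this document: that the codes `11` and `10` decode to `+threshold` and `-threshold`, and that codes are packed into a 32-bit word starting from the highest bits.

```python
def quantize_2bit(grads, residual, threshold=0.5):
    """Return one 2-bit code per gradient and update `residual` in place."""
    codes = []
    for i, g in enumerate(grads):
        residual[i] += g                  # add the new gradient to the delayed part
        if residual[i] >= threshold:
            codes.append(0b11)            # positive value at or above the threshold
            residual[i] -= threshold      # keep the quantization error (assumed decode: +threshold)
        elif residual[i] <= -threshold:
            codes.append(0b10)            # negative value at or beyond -threshold
            residual[i] += threshold      # keep the quantization error (assumed decode: -threshold)
        else:
            codes.append(0b00)            # too small: stays entirely in the residual
    return codes


def dequantize_2bit(codes, threshold=0.5):
    """Map codes back to values (assumption: 11 -> +threshold, 10 -> -threshold)."""
    table = {0b11: threshold, 0b10: -threshold, 0b00: 0.0}
    return [table[c] for c in codes]


def pack_codes(codes):
    """Pack up to 16 two-bit codes into one 32-bit word, first code in the highest bits."""
    word = 0
    for c in codes:
        word = (word << 2) | c
    return word << (2 * (16 - len(codes)))


grads = [0.7, -0.6, 0.2, 0.4]
residual = [0.0] * len(grads)
codes = quantize_2bit(grads, residual)
print(codes)                    # [3, 2, 0, 0]
print(dequantize_2bit(codes))   # [0.5, -0.5, 0.0, 0.0]
print(hex(pack_codes(codes)))   # 0xe0000000
```

After this call, `residual` holds roughly `[0.2, -0.1, 0.2, 0.4]`, so the two small gradients are not lost: they are carried over and sent once the accumulated value reaches the threshold.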
+
+### Types of Kvstore
+
+Supported types of `kvstore` are `device` and all distributed kvstores such as `dist_sync`, `dist_async`, and `dist_sync_device`. When `kvstore` is `device`, the communication between GPUs is compressed. Please note that this increases the memory usage of GPUs because of the additional residual stored. When using a distributed kvstore, worker-to-server communication is compressed. In this case, compression and decompression happen on the CPU, and gradient residuals will be stored on the CPU. Server-to-worker communication and device-to-device communication are not compressed to avoid multiple levels of compression.
+
+## Enabling the Gradient Compression in MXNet
+
+Gradient compression is a run-time configuration parameter to be enabled during training. Here are the MXNet APIs to enable gradient compression:
+
+**Gluon API**:
+
+```
+trainer = gluon.Trainer(..., compression_params={'type': '2bit', 'threshold': 0.5})
+```
+A reference `gluon` implementation with a gradient compression option can be found in the [train.py script from a word-level language modeling RNN example](https://github.com/apache/incubator-mxnet/blob/master/example/gluon/word_language_model/train.py).
+
+**Module API**:
+
+```
+mod = mx.mod.Module(..., compression_params={'type': '2bit', 'threshold': 0.5})
+```
+
+A `module` example is provided with [this guide for setting up MXNet with distributed training](https://mxnet.incubator.apache.org/versions/master/how_to/multi_devices.html#distributed-training-with-multiple-machines). It comes with the option of turning on gradient compression as an argument to the [train_mnist.py script](https://github.com/apache/incubator-mxnet/blob/master/example/image-classification/train_mnist.py).
+
+### Configuration Details
+
+**Threshold**
+
+A default `threshold` value of `0.5` is good for most use cases, but to get the most benefit from gradient compression in a particular scenario, it can be worth experimenting. If the threshold is set to a very large value, say `10.0`, then updates become too infrequent and training will converge more slowly. Setting the threshold automatically is expected in a future release.
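
To see why a large threshold slows convergence, here is a small sketch. It assumes a constant per-step gradient and that an update is sent as soon as the accumulated residual reaches the threshold. The helper name `steps_until_first_update` is hypothetical and not part of MXNet.

```python
def steps_until_first_update(grad, threshold):
    """Count mini-batches until the accumulated residual reaches the threshold."""
    residual, steps = 0.0, 0
    while abs(residual) < threshold:
        residual += grad      # the gradient is delayed and accumulated locally
        steps += 1
    return steps


print(steps_until_first_update(0.125, 0.5))    # 4
print(steps_until_first_update(0.125, 10.0))   # 80
```

Under these assumptions, raising the threshold from `0.5` to `10.0` makes a weight with this gradient wait 20 times longer for its first update.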
+
+**Quantization**
+
+This release supports 2-bit quantization for encoding of gradients to reduce the communication bandwidth during training. Future releases will support 1-bit quantization and other approaches for encoding of gradients based on experimental evidence of benefits and user demand.
+
+**Sparse Format**
+
+We believe that the density of the data will need to be very low (i.e., more than about 90% zeros) to reap the benefits of the sparse format. However, this is an area of experimentation that will be explored in a future release.
+
+
+## References
+
+1. [Nikko Strom, Amazon.com, Scalable Distributed DNN Training Using Commodity GPU Cloud Computing.](https://s3-us-west-2.amazonaws.com/amazon.jobs-public-documents/strom_interspeech2015.pdf)
diff --git a/docs/faq/index.md b/docs/faq/index.md
index e29bda0b68..68c7d41cb8 100644
--- a/docs/faq/index.md
+++ b/docs/faq/index.md
@@ -14,12 +14,15 @@ and full working examples, visit the [tutorials section](../tutorials/index.md).
 * [How do I visualize neural networks as computation graphs?](http://mxnet.io/how_to/visualize_graph.html)
 
 
-## Speed
-
+## Scale
 * [How can I train with multiple CPU/GPUs with data parallelism?](http://mxnet.io/how_to/multi_devices.html)
 
 * [How can I train with multiple GPUs with model parallelism?](http://mxnet.io/how_to/model_parallel_lstm.html)
 
+
+## Speed
+* [How do I use gradient compression with distributed training?](http://mxnet.io/how_to/gradient_compression.html)
+
 * [Can I use nnpack to improve the CPU performance of MXNet?](http://mxnet.io/how_to/nnpack.html)
 
 * [What are the best setup and data-handling tips and tricks for improving speed?](http://mxnet.io/how_to/perf.html)
@@ -41,7 +44,7 @@ and full working examples, visit the [tutorials section](../tutorials/index.md).
 * [How to convert MXNet models to Apple CoreML format?](https://github.com/apache/incubator-mxnet/tree/master/tools/coreml)
 
 ## Security
-* [How to run MXNet securely?](http://mxnet.io/how_to/security.md)
+* [How to run MXNet securely?](http://mxnet.io/how_to/security.html)
 
 ## Extend and Contribute to MXNet
 
diff --git a/docs/faq/multi_devices.md b/docs/faq/multi_devices.md
index 3272062243..c79d1f80be 100644
--- a/docs/faq/multi_devices.md
+++ b/docs/faq/multi_devices.md
@@ -167,6 +167,19 @@ python ../../tools/launch.py -n 2 -H hosts --sync-dst-dir /tmp/mxnet \
    python train_mnist.py --network lenet --kv-store dist_sync
 ```
 
+
+### Gradient compression
+
+If your model has fully connected components or recurrent neural networks, you may achieve increased training speed using gradient compression with potentially slight loss of accuracy. Please see [Gradient Compression](https://mxnet.incubator.apache.org/versions/master/faq/gradient_compression.html) for more details on when and how to use it. For the above example, gradient compression can be enabled by running the following:
+
+```bash
+python ../../tools/launch.py -n 2 --launcher ssh -H hosts python train_mnist.py --network lenet \
+    --kv-store dist_sync --gc-type 2bit
+```
+
+In this example, `gc-type` has been set to `2bit`, to enable two bit gradient compression.
+
+
 ### Use a Particular Network Interface
 
 _MXNet_ often chooses the first available network interface.
diff --git a/docs/tutorials/basic/ndarray_indexing.md b/docs/tutorials/basic/ndarray_indexing.md
new file mode 100644
index 0000000000..37168b3401
--- /dev/null
+++ b/docs/tutorials/basic/ndarray_indexing.md
@@ -0,0 +1,377 @@
+
+# NDArray Indexing - Array indexing features
+
+MXNet's advanced indexing features are modeled after [NumPy's implementation and documentation](https://docs.scipy.org/doc/numpy-1.13.0/reference/arrays.indexing.html#combining-advanced-and-basic-indexing). You will see direct adaptations of many NumPy indexing features and examples which are close, if not identical, so we borrow much from their documentation.
+
+`NDArray`s can be indexed using the standard Python `x[obj]` syntax, where _x_ is the array and _obj_ the selection.
+
+There are two kinds of indexing available:
+
+1. basic slicing
+1. advanced indexing
+
+In MXNet, we support both basic and advanced indexing following the convention of indexing NumPy's `ndarray`.
+
+
+## Basic Slicing and Indexing
+
+Basic slicing extends Python's basic concept of slicing to N dimensions. For a quick review:
+
+```
+a[start:end] # items start through end-1
+a[start:]    # items start through the rest of the array
+a[:end]      # items from the beginning through end-1
+a[:]         # a copy of the whole array
+```
+
+
+```python
+from mxnet import nd
+```
+
+For some working examples of basic slicing we'll start simple.
+
+
+```python
+x = nd.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype='int32')
+x[5:]
+```
+
+
+
+
+
+    [5 6 7 8 9]
+    <NDArray 5 @cpu(0)>
+
+
+
+
+```python
+x = nd.array([0, 1, 2, 3])
+print('1D complete array, x=', x)
+s = x[1:3]
+print('slicing the 2nd and 3rd elements, s=', s)
+```
+
+    1D complete array, x=
+    [ 0.  1.  2.  3.]
+    <NDArray 4 @cpu(0)>
+    slicing the 2nd and 3rd elements, s=
+    [ 1.  2.]
+    <NDArray 2 @cpu(0)>
+
+
+Now let's try slicing the 2nd and 3rd elements of a multi-dimensional array.
+
+
+```python
+x = nd.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
+print('multi-D complete array, x=', x)
+s = x[1:3]
+print('slicing the 2nd and 3rd elements, s=', s)
+```
+
+    multi-D complete array, x=
+    [[  1.   2.   3.   4.]
+     [  5.   6.   7.   8.]
+     [  9.  10.  11.  12.]]
+    <NDArray 3x4 @cpu(0)>
+    slicing the 2nd and 3rd elements, s=
+    [[  5.   6.   7.   8.]
+     [  9.  10.  11.  12.]]
+    <NDArray 2x4 @cpu(0)>
+
+
+Now let's try writing to a specific element. We'll write `9` to element `2` using `x[2] = 9.0`, which will update the whole row.
+
+
+```python
+print('original x, x=', x)
+x[2] = 9.0
+print('replaced entire row with x[2] = 9.0, x=', x)
+```
+
+    original x, x=
+    [[  1.   2.   3.   4.]
+     [  5.   6.   7.   8.]
+     [  9.  10.  11.  12.]]
+    <NDArray 3x4 @cpu(0)>
+    replaced entire row with x[2] = 9.0, x=
+    [[ 1.  2.  3.  4.]
+     [ 5.  6.  7.  8.]
+     [ 9.  9.  9.  9.]]
+    <NDArray 3x4 @cpu(0)>
+
+
+We can target specific elements too. Let's replace the number `3` in the first row with the number `9` using `x[0, 2] = 9.0`.
+
+
+```python
+print('original x, x=', x)
+x[0, 2] = 9.0
+print('replaced specific element with x[0, 2] = 9.0, x=', x)
+```
+
+    original x, x=
+    [[ 1.  2.  3.  4.]
+     [ 5.  6.  7.  8.]
+     [ 9.  9.  9.  9.]]
+    <NDArray 3x4 @cpu(0)>
+    replaced specific element with x[0, 2] = 9.0, x=
+    [[ 1.  2.  9.  4.]
+     [ 5.  6.  7.  8.]
+     [ 9.  9.  9.  9.]]
+    <NDArray 3x4 @cpu(0)>
+
+
+Now let's go further and select a range of elements at once. We'll replace the `6` and the `7` with `x[1:2, 1:3] = 5.0`.
+
+
+```python
+print('original x, x=', x)
+x[1:2, 1:3] = 5.0
+print('replaced range of elements with x[1:2, 1:3] = 5.0, x=', x)
+```
+
+    original x, x=
+    [[ 1.  2.  9.  4.]
+     [ 5.  6.  7.  8.]
+     [ 9.  9.  9.  9.]]
+    <NDArray 3x4 @cpu(0)>
+    replaced range of elements with x[1:2, 1:3] = 5.0, x=
+    [[ 1.  2.  9.  4.]
+     [ 5.  5.  5.  8.]
+     [ 9.  9.  9.  9.]]
+    <NDArray 3x4 @cpu(0)>
+
+
+## New Indexing Features in v1.0
+
+### Step
+
+The basic slice syntax is `i:j:k` where _i_ is the starting index, _j_ is the stopping index, and _k_ is the step (_k_ must be nonzero).
+
+**Note**: Previously, MXNet supported basic slicing and indexing only with `step=1`. From release 1.0, arbitrary values of `step` are supported.
+
+
+```python
+x = nd.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype='int32')
+# Select elements 1 through 7, and use a step of 2
+x[1:7:2]
+```
+
+
+
+
+
+    [1 3 5]
+    <NDArray 3 @cpu(0)>
+
+
+
+### Negative Indices
+Negative _i_ and _j_ are interpreted as _n + i_ and _n + j_ where _n_ is the number of elements in the corresponding dimension. Negative _k_ makes stepping go towards smaller indices.
+
+
+```python
+x[-2:10]
+```
+
+
+
+
+
+    [8 9]
+    <NDArray 2 @cpu(0)>
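
The _n + i_ rule above is the same normalization Python applies to built-in sequences, so it can be checked on a plain list with `slice.indices`. This sketch uses only the standard library and assumes, as the text states, that `NDArray` follows the same convention.

```python
n = 10
data = list(range(n))

# slice.indices(n) resolves negative and out-of-range bounds for a length-n axis
start, stop, step = slice(-2, 10).indices(n)
print(start, stop, step)        # 8 10 1
print(data[start:stop:step])    # [8, 9]

# A negative step walks towards smaller indices
print(data[7:1:-2])             # [7, 5, 3]
```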
+
+
+
+If the number of objects in the selection tuple is less than _N_ (the number of dimensions of the array), then `:` is assumed for any subsequent dimensions.
+
+
+```python
+x = nd.array([[[1],[2],[3]],
+              [[4],[5],[6]]], dtype='int32')
+x[1:2]
+```
+
+
+
+
+
+    [[[4]
+      [5]
+      [6]]]
+    <NDArray 1x3x1 @cpu(0)>
+
+
+
+You may use slicing to set values in the array, but (unlike lists) you can never grow the array. The size of the value to be set in `x[obj] = value` must be able to broadcast to the same shape as `x[obj]`.
+
+
+```python
+x = nd.arange(16, dtype='int32').reshape((4, 4))
+print(x)
+```
+
+
+    [[ 0  1  2  3]
+     [ 4  5  6  7]
+     [ 8  9 10 11]
+     [12 13 14 15]]
+    <NDArray 4x4 @cpu(0)>
+
+
+
+```python
+print(x[1:4:2, 3:0:-1])
+```
+
+
+    [[ 7  6  5]
+     [15 14 13]]
+    <NDArray 2x3 @cpu(0)>
+
+
+
+```python
+x[1:4:2, 3:0:-1] = [[16], [17]]
+print(x)
+```
+
+
+    [[ 0  1  2  3]
+     [ 4 16 16 16]
+     [ 8  9 10 11]
+     [12 17 17 17]]
+    <NDArray 4x4 @cpu(0)>
+
+
+## New Advanced Indexing Features in v1.0
+
+Advanced indexing is triggered when the selection object, obj, is a non-tuple sequence object (e.g. a Python list), a NumPy `ndarray` (of data type integer), an MXNet `NDArray`, or a tuple with at least one sequence object.
+
+Advanced indexing always returns a __copy__ of the data.
+
+**Note**:
+- When the selection object is a Python list, it must be a flat list of integers. MXNet does not support a nested list as the selection object. That is, `x[[1, 2]]` is supported, while `x[[[1], [2]]]` is not.
+- When the selection object is a NumPy `ndarray` or an MXNet `NDArray`, there are no restrictions on the dimensions of the object.
+- When the selection object is a tuple containing Python list(s), both integer lists and nested lists are supported. That is, both `x[1:4, [1, 2]]` and `x[1:4, [[1], [2]]]` are supported.
+
+### Purely Integer Array Indexing
+When the index consists of as many integer arrays as the array being indexed has dimensions, the indexing is straightforward, but different from slicing.
+
+Advanced indices are always [broadcast](https://docs.scipy.org/doc/numpy-1.13.0/reference/ufuncs.html#ufuncs-broadcasting) and iterated as one:
+```python
+result[i_1, ..., i_M] == x[ind_1[i_1, ..., i_M], ind_2[i_1, ..., i_M],
+                           ..., ind_N[i_1, ..., i_M]]
+```
+Note that the result shape is identical to the (broadcast) indexing array shapes `ind_1, ..., ind_N`.
+
+**Example:**
+From each row, a specific element should be selected. The row index is just `[0, 1, 2]`, and the column index specifies the element to choose for the corresponding row, here `[0, 1, 0]`. Using both together, the task can be solved with advanced indexing:
+
+
+```python
+x = nd.array([[1, 2],
+              [3, 4],
+              [5, 6]], dtype='int32')
+x[[0, 1, 2], [0, 1, 0]]
+```
+
+
+
+
+
+    [1 4 5]
+    <NDArray 3 @cpu(0)>
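
The "iterated as one" rule can be spelled out with nested Python lists: the row and column index lists are walked in lockstep, and each pair selects one element. This is only an illustration of the semantics, using plain lists instead of `NDArray`.

```python
x = [[1, 2],
     [3, 4],
     [5, 6]]
rows = [0, 1, 2]
cols = [0, 1, 0]

# result[i] == x[rows[i]][cols[i]]
result = [x[r][c] for r, c in zip(rows, cols)]
print(result)   # [1, 4, 5]
```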
+
+
+
+To achieve a behavior similar to the basic slicing above, broadcasting can be used. This is best understood with an example.
+
+**Example:**
+From a 4x3 array, the corner elements should be selected using advanced indexing. Thus, all elements for which the column is one of `[0, 2]` and the row is one of `[0, 3]` need to be selected. To use advanced indexing, one needs to select all elements explicitly. Using the method explained previously, one could write:
+
+
+```python
+x = nd.array([[ 0,  1,  2],
+              [ 3,  4,  5],
+              [ 6,  7,  8],
+              [ 9, 10, 11]], dtype='int32')
+x[[[0, 0], [3, 3]],
+  [[0, 2], [0, 2]]]
+```
+
+
+
+
+
+    [[ 0  2]
+     [ 9 11]]
+    <NDArray 2x2 @cpu(0)>
+
+
+
+However, since the indexing arrays above just repeat themselves, broadcasting can be used.
+
+
+```python
+x[[[0], [3]],
+  [[0, 2]]]
+```
+
+
+
+
+
+    [[ 0  2]
+     [ 9 11]]
+    <NDArray 2x2 @cpu(0)>
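
What broadcasting does in the example above can be sketched with nested lists: a column of row indices with shape (R, 1) is combined with a row of column indices with shape (1, C) to give an (R, C) result. The helper `pick_broadcast` is hypothetical and handles only this one pair of shapes.

```python
def pick_broadcast(x, rows, cols):
    """rows has shape (R, 1) and cols has shape (1, C); the result has shape (R, C)."""
    return [[x[r][c] for c in cols[0]] for (r,) in rows]


x = [[0, 1, 2],
     [3, 4, 5],
     [6, 7, 8],
     [9, 10, 11]]
corners = pick_broadcast(x, [[0], [3]], [[0, 2]])
print(corners)   # [[0, 2], [9, 11]]
```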
+
+
+
+### Combining Advanced and Basic Indexing
+There are three situations we need to consider when mixing advanced and basic indices in a single selection object. Let's look at examples to understand each one's behavior.
+
+- There is only one advanced index in the selection object. For example, `x` is an `NDArray` with `shape=(10, 20, 30, 40, 50)` and `result=x[:, :, ind]` has one advanced index `ind` with `shape=(2, 3, 4)` on the third axis. The `result` will have `shape=(10, 20, 2, 3, 4, 40, 50)` because the subspace of `x` in the third dimension is replaced by the subspace of `shape=(2, 3, 4)`. If we let _i_, _j_, _k_ loop over the (2, 3, 4)-shaped subspace, it is equivalent to `result[:, :, i, j, k, :, :] = x[:, :, ind[i, j, k], :, :]`.
+
+
+```python
+import numpy as np
+shape = (10, 20, 30, 40, 50)
+x = nd.arange(np.prod(shape), dtype='int32').reshape(shape)
+ind = nd.arange(24).reshape((2, 3, 4))
+print(x[:, :, ind].shape)
+```
+
+    (10, 20, 2, 3, 4, 40, 50)
+
+
+- There are at least two advanced indices in the selection object, and all the advanced indices are adjacent to each other. For example, `x` is an `NDArray` with `shape=(10, 20, 30, 40, 50)` and `result=x[:, :, ind1, ind2, :]` has two advanced indices with shapes that are broadcastable to `shape=(2, 3, 4)`. Then the `result` has `shape=(10, 20, 2, 3, 4, 50)` because the `(30, 40)`-shaped subspace has been replaced with the `(2, 3, 4)`-shaped subspace from the indices.
+
+
+```python
+ind1 = [0, 1, 2, 3]
+ind2 = [[[0], [1], [2]], [[3], [4], [5]]]
+print(x[:, :, ind1, ind2, :].shape)
+```
+
+    (10, 20, 2, 3, 4, 50)
+
+
+- There are at least two advanced indices in the selection object, and at least one advanced index is separated from the others by basic indices. For example, `x` is an `NDArray` with `shape=(10, 20, 30, 40, 50)` and `result=x[:, :, ind1, :, ind2]` has two advanced indices with shapes that are broadcastable to `shape=(2, 3, 4)`. Then the `result` has `shape=(2, 3, 4, 10, 20, 40)` because there is no unambiguous position for the indexing subspace, so it is placed at the beginning.
+
+
+```python
+print(x[:, :, ind1, :, ind2].shape)
+```
+
+    (2, 3, 4, 10, 20, 40)
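
The three shape rules above can be collected into one small shape calculator. This is a sketch of the rules as this tutorial describes them, not of MXNet internals. The helpers `broadcast_shapes` and `result_shape` are hypothetical. In `selection`, `None` stands for `:` and a tuple gives the shape of an advanced index.

```python
def broadcast_shapes(*shapes):
    """Broadcast index shapes together, right-aligned as in NumPy."""
    ndim = max(len(s) for s in shapes)
    padded = [(1,) * (ndim - len(s)) + tuple(s) for s in shapes]
    out = []
    for dims in zip(*padded):
        sizes = {d for d in dims if d != 1}
        if len(sizes) > 1:
            raise ValueError("index shapes are not broadcastable")
        out.append(sizes.pop() if sizes else 1)
    return tuple(out)


def result_shape(x_shape, selection):
    """Shape of x[selection]; requires at least one advanced index."""
    selection = list(selection) + [None] * (len(x_shape) - len(selection))
    adv = [i for i, s in enumerate(selection) if s is not None]
    adv_shape = broadcast_shapes(*[selection[i] for i in adv])
    if adv == list(range(adv[0], adv[-1] + 1)):
        # One advanced index, or several adjacent ones: replace their axes in place
        return x_shape[:adv[0]] + adv_shape + x_shape[adv[-1] + 1:]
    # Separated advanced indices: the index subspace goes first
    kept = tuple(d for i, d in enumerate(x_shape) if selection[i] is None)
    return adv_shape + kept


shape = (10, 20, 30, 40, 50)
print(result_shape(shape, [None, None, (2, 3, 4)]))                # (10, 20, 2, 3, 4, 40, 50)
print(result_shape(shape, [None, None, (4,), (2, 3, 1), None]))    # (10, 20, 2, 3, 4, 50)
print(result_shape(shape, [None, None, (4,), None, (2, 3, 1)]))    # (2, 3, 4, 10, 20, 40)
```

The three calls reproduce the shapes printed in the three cases above.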
+
+## References
+
+[NumPy documentation](https://docs.scipy.org/doc/numpy-1.13.0/reference/arrays.indexing.html#combining-advanced-and-basic-indexing)
+
+<!-- INSERT SOURCE DOWNLOAD BUTTONS -->
diff --git a/docs/tutorials/index.md b/docs/tutorials/index.md
index 6429dfb31b..d20a821193 100644
--- a/docs/tutorials/index.md
+++ b/docs/tutorials/index.md
@@ -1,14 +1,11 @@
 # Tutorials
 
-These tutorials introduce a few fundamental concepts in deep learning and how to implement them in _MXNet_. The _Basics_ section contains tutorials on manipulating arrays, building networks, loading/preprocessing data, etc. The _Training and Inference_ section talks about implementing Linear Regression, training a Handwritten digit classifier using MLP and CNN, running inferences using a pre-trained model, and lastly, efficiently training a large scale image classifier.
-
-
 ## Gluon
 
 Gluon is the high-level interface for MXNet. It is more intuitive and easier to use than the lower level interface.
 Gluon supports dynamic (define-by-run) graphs with JIT-compilation to achieve both flexibility and efficiency.
-This is a selected subset of Gluon tutorials. For the comprehensive tutorial on Gluon,
-please see [gluon.mxnet.io](http://gluon.mxnet.io).
+
+This is a selected subset of Gluon tutorials that explains basic usage of Gluon and fundamental concepts in deep learning. For the comprehensive tutorial on Gluon that covers topics from basic statistics and probability theory to reinforcement learning and recommender systems, please see [gluon.mxnet.io](http://gluon.mxnet.io). 
 
 ### Basics
 
@@ -32,6 +29,8 @@ please see [gluon.mxnet.io](http://gluon.mxnet.io).
 
 ## MXNet
 
+These tutorials introduce a few fundamental concepts in deep learning and how to implement them in _MXNet_. The _Basics_ section contains tutorials on manipulating arrays, building networks, loading/preprocessing data, etc. The _Training and Inference_ section talks about implementing Linear Regression, training a Handwritten digit classifier using MLP and CNN, running inferences using a pre-trained model, and lastly, efficiently training a large scale image classifier.
+
 ### Basics
 
 ```eval_rst
@@ -39,6 +38,7 @@ please see [gluon.mxnet.io](http://gluon.mxnet.io).
    :maxdepth: 1
 
    basic/ndarray
+   basic/ndarray_indexing
    basic/symbol
    basic/module
    basic/data
diff --git a/example/gluon/word_language_model/get_ptb_data.sh b/example/gluon/word_language_model/get_ptb_data.sh
index d2641cb32b..0a0c7051b0 100755
--- a/example/gluon/word_language_model/get_ptb_data.sh
+++ b/example/gluon/word_language_model/get_ptb_data.sh
@@ -17,6 +17,11 @@
 # specific language governing permissions and limitations
 # under the License.
 
+echo ""
+echo "NOTE: Please review the licensing of the datasets in this script before proceeding"
+echo "See https://catalog.ldc.upenn.edu/ldc99t42 for the licensing"
+echo "Once that is done, please uncomment the wget commands in this script"
+echo ""
 
 RNN_DIR=$(cd `dirname $0`; pwd)
 DATA_DIR="${RNN_DIR}/data/"
@@ -26,7 +31,7 @@ if [[ ! -d "${DATA_DIR}" ]]; then
   mkdir -p ${DATA_DIR}
 fi
 
-wget -P ${DATA_DIR} https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/ptb/ptb.train.txt;
-wget -P ${DATA_DIR} https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/ptb/ptb.valid.txt;
-wget -P ${DATA_DIR} https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/ptb/ptb.test.txt;
-wget -P ${DATA_DIR} https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/tinyshakespeare/input.txt;
+#wget -P ${DATA_DIR} https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/ptb/ptb.train.txt;
+#wget -P ${DATA_DIR} https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/ptb/ptb.valid.txt;
+#wget -P ${DATA_DIR} https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/ptb/ptb.test.txt;
+#wget -P ${DATA_DIR} https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/tinyshakespeare/input.txt;
diff --git a/example/gluon/word_language_model/train.py b/example/gluon/word_language_model/train.py
index 0b504998be..b419277dcf 100644
--- a/example/gluon/word_language_model/train.py
+++ b/example/gluon/word_language_model/train.py
@@ -54,6 +54,11 @@
                     help='report interval')
 parser.add_argument('--save', type=str, default='model.params',
                     help='path to save the final model')
+parser.add_argument('--gctype', type=str, default='none',
+                    help='type of gradient compression to use, \
+                          takes `2bit` or `none` for now.')
+parser.add_argument('--gcthreshold', type=float, default=0.5,
+                    help='threshold for 2bit gradient compression')
 args = parser.parse_args()
 
 
@@ -90,10 +95,13 @@ def batchify(data, batch_size):
 model = model.RNNModel(args.model, ntokens, args.emsize, args.nhid,
                        args.nlayers, args.dropout, args.tied)
 model.collect_params().initialize(mx.init.Xavier(), ctx=context)
+
+compression_params = None if args.gctype == 'none' else {'type': args.gctype, 'threshold': args.gcthreshold}
 trainer = gluon.Trainer(model.collect_params(), 'sgd',
                         {'learning_rate': args.lr,
                          'momentum': 0,
-                         'wd': 0})
+                         'wd': 0},
+                        compression_params=compression_params)
 loss = gluon.loss.SoftmaxCrossEntropyLoss()
 
 ###############################################################################
diff --git a/example/image-classification/predict-cpp/image-classification-predict.cc b/example/image-classification/predict-cpp/image-classification-predict.cc
index d3b875638d..a4a968ee3c 100644
--- a/example/image-classification/predict-cpp/image-classification-predict.cc
+++ b/example/image-classification/predict-cpp/image-classification-predict.cc
@@ -1,22 +1,3 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- *   http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing,
- * software distributed under the License is distributed on an
- * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
- * KIND, either express or implied.  See the License for the
- * specific language governing permissions and limitations
- * under the License.
- */
-
 /*!
  *  Copyright (c) 2015 by Xiao Liu, pertusa, caprice-j
  * \file image_classification-predict.cpp
diff --git a/example/model-parallel-lstm/get_ptb_data.sh b/example/model-parallel-lstm/get_ptb_data.sh
index d2641cb32b..0a0c7051b0 100755
--- a/example/model-parallel-lstm/get_ptb_data.sh
+++ b/example/model-parallel-lstm/get_ptb_data.sh
@@ -17,6 +17,11 @@
 # specific language governing permissions and limitations
 # under the License.
 
+echo ""
+echo "NOTE: Please review the licensing of the datasets in this script before proceeding"
+echo "See https://catalog.ldc.upenn.edu/ldc99t42 for the licensing"
+echo "Once that is done, please uncomment the wget commands in this script"
+echo ""
 
 RNN_DIR=$(cd `dirname $0`; pwd)
 DATA_DIR="${RNN_DIR}/data/"
@@ -26,7 +31,7 @@ if [[ ! -d "${DATA_DIR}" ]]; then
   mkdir -p ${DATA_DIR}
 fi
 
-wget -P ${DATA_DIR} https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/ptb/ptb.train.txt;
-wget -P ${DATA_DIR} https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/ptb/ptb.valid.txt;
-wget -P ${DATA_DIR} https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/ptb/ptb.test.txt;
-wget -P ${DATA_DIR} https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/tinyshakespeare/input.txt;
+#wget -P ${DATA_DIR} https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/ptb/ptb.train.txt;
+#wget -P ${DATA_DIR} https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/ptb/ptb.valid.txt;
+#wget -P ${DATA_DIR} https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/ptb/ptb.test.txt;
+#wget -P ${DATA_DIR} https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/tinyshakespeare/input.txt;
diff --git a/example/rnn-time-major/get_ptb_data.sh b/example/rnn-time-major/get_ptb_data.sh
index d2641cb32b..0a0c7051b0 100755
--- a/example/rnn-time-major/get_ptb_data.sh
+++ b/example/rnn-time-major/get_ptb_data.sh
@@ -17,6 +17,11 @@
 # specific language governing permissions and limitations
 # under the License.
 
+echo ""
+echo "NOTE: Please review the licensing of the datasets in this script before proceeding"
+echo "See https://catalog.ldc.upenn.edu/ldc99t42 for the licensing"
+echo "Once that is done, please uncomment the wget commands in this script"
+echo ""
 
 RNN_DIR=$(cd `dirname $0`; pwd)
 DATA_DIR="${RNN_DIR}/data/"
@@ -26,7 +31,7 @@ if [[ ! -d "${DATA_DIR}" ]]; then
   mkdir -p ${DATA_DIR}
 fi
 
-wget -P ${DATA_DIR} https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/ptb/ptb.train.txt;
-wget -P ${DATA_DIR} https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/ptb/ptb.valid.txt;
-wget -P ${DATA_DIR} https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/ptb/ptb.test.txt;
-wget -P ${DATA_DIR} https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/tinyshakespeare/input.txt;
+#wget -P ${DATA_DIR} https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/ptb/ptb.train.txt;
+#wget -P ${DATA_DIR} https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/ptb/ptb.valid.txt;
+#wget -P ${DATA_DIR} https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/ptb/ptb.test.txt;
+#wget -P ${DATA_DIR} https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/tinyshakespeare/input.txt;
diff --git a/example/rnn/bucket_R/aclImdb_lstm_classification.R b/example/rnn/bucket_R/aclImdb_lstm_classification.R
index bb5eaacf26..27fe000463 100644
--- a/example/rnn/bucket_R/aclImdb_lstm_classification.R
+++ b/example/rnn/bucket_R/aclImdb_lstm_classification.R
@@ -1,3 +1,20 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
 require("mxnet")
 
 source("mx.io.bucket.iter.R")
@@ -13,21 +30,21 @@ batch.size <- 64
 
 num.round <- 16
 
-train.data <- mx.io.bucket.iter(buckets = corpus_bucketed_train$buckets, batch.size = batch.size, 
+train.data <- mx.io.bucket.iter(buckets = corpus_bucketed_train$buckets, batch.size = batch.size,
   data.mask.element = 0, shuffle = TRUE)
 
-eval.data <- mx.io.bucket.iter(buckets = corpus_bucketed_test$buckets, batch.size = batch.size, 
+eval.data <- mx.io.bucket.iter(buckets = corpus_bucketed_test$buckets, batch.size = batch.size,
   data.mask.element = 0, shuffle = FALSE)
 
 mx.set.seed(0)
-optimizer <- mx.opt.create("adadelta", rho = 0.92, epsilon = 1e-06, wd = 2e-04, clip_gradient = NULL, 
+optimizer <- mx.opt.create("adadelta", rho = 0.92, epsilon = 1e-06, wd = 2e-04, clip_gradient = NULL,
   rescale.grad = 1/batch.size)
 
-model_sentiment_lstm <- mx.rnn.buckets(train.data = train.data, begin.round = 1, 
-  num.round = num.round, ctx = mx.cpu(), metric = mx.metric.accuracy, optimizer = optimizer, 
-  num.rnn.layer = 2, num.embed = 16, num.hidden = 24, num.label = 2, input.size = vocab, 
-  initializer = mx.init.Xavier(rnd_type = "gaussian", factor_type = "in", magnitude = 2), 
-  dropout = 0.25, config = "seq-to-one", batch.end.callback = mx.callback.log.train.metric(period = 50), 
+model_sentiment_lstm <- mx.rnn.buckets(train.data = train.data, begin.round = 1,
+  num.round = num.round, ctx = mx.cpu(), metric = mx.metric.accuracy, optimizer = optimizer,
+  num.rnn.layer = 2, num.embed = 16, num.hidden = 24, num.label = 2, input.size = vocab,
+  initializer = mx.init.Xavier(rnd_type = "gaussian", factor_type = "in", magnitude = 2),
+  dropout = 0.25, config = "seq-to-one", batch.end.callback = mx.callback.log.train.metric(period = 50),
   verbose = TRUE)
 
 mx.model.save(model_sentiment_lstm, prefix = "model_sentiment_lstm", iteration = num.round)
diff --git a/example/rnn/bucket_R/data_preprocessing.R b/example/rnn/bucket_R/data_preprocessing.R
index c91e3fb5eb..9652077693 100644
--- a/example/rnn/bucket_R/data_preprocessing.R
+++ b/example/rnn/bucket_R/data_preprocessing.R
@@ -1,6 +1,23 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
 # download the IMDB dataset
 if (!file.exists("aclImdb_v1.tar.gz")) {
-  download.file("http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz", 
+  download.file("http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz",
     "aclImdb_v1.tar.gz")
   untar("aclImdb_v1.tar.gz")
 }
@@ -43,7 +60,7 @@ saveRDS(test_raw, file = "test_raw.rds")
 text_pre_process <- function(corpus, count_threshold = 10, dic = NULL) {
   raw_vec <- corpus
   raw_vec <- stri_enc_toascii(str = raw_vec)
-  
+
   ### remove non-printable characters
   raw_vec <- str_replace_all(string = raw_vec, pattern = "[^[:print:]]", replacement = "")
   raw_vec <- str_to_lower(string = raw_vec)
@@ -51,12 +68,12 @@ text_pre_process <- function(corpus, count_threshold = 10, dic = NULL) {
   raw_vec <- str_replace_all(string = raw_vec, pattern = "\\bbr\\b", replacement = "")
   raw_vec <- str_replace_all(string = raw_vec, pattern = "\\s+", replacement = " ")
   raw_vec <- str_trim(string = raw_vec)
-  
+
   ### Split raw sequence vectors into lists of word vectors (one list element per
   ### sequence)
-  word_vec_list <- stri_split_boundaries(raw_vec, type = "word", skip_word_none = T, 
+  word_vec_list <- stri_split_boundaries(raw_vec, type = "word", skip_word_none = T,
     skip_word_number = F, simplify = F)
-  
+
   ### Build vocabulary
   if (is.null(dic)) {
     word_vec_unlist <- unlist(word_vec_list)
@@ -66,79 +83,79 @@ text_pre_process <- function(corpus, count_threshold = 10, dic = NULL) {
     stopwords <- c(letters, "an", "the", "br")
     word_keep <- setdiff(word_keep, stopwords)
   } else word_keep <- names(dic)[!dic == 0]
-  
+
   ### Clean the sentences to keep only the curated list of words
   word_vec_list <- lapply(word_vec_list, function(x) x[x %in% word_keep])
-  
+
   # sentence_vec<- stri_split_boundaries(raw_vec, type='sentence', simplify = T)
   word_vec_length <- lapply(word_vec_list, length) %>% unlist()
-  
+
   ### Build dictionnary
   dic <- 1:length(word_keep)
   names(dic) <- word_keep
   dic <- c(`?` = 0, dic)
-  
+
   ### reverse dictionnary
   rev_dic <- names(dic)
   names(rev_dic) <- dic
-  
+
   return(list(word_vec_list = word_vec_list, dic = dic, rev_dic = rev_dic))
 }
 
-################################################################ 
+################################################################
 make_bucket_data <- function(word_vec_list, labels, dic, seq_len = c(225), right_pad = T) {
   ### Trunc sequence to max bucket length
   word_vec_list <- lapply(word_vec_list, head, n = max(seq_len))
-  
+
   word_vec_length <- lapply(word_vec_list, length) %>% unlist()
-  bucketID <- cut(word_vec_length, breaks = c(0, seq_len, Inf), include.lowest = T, 
+  bucketID <- cut(word_vec_length, breaks = c(0, seq_len, Inf), include.lowest = T,
     labels = F)
   # table(bucketID)
-  
+
   ### Right or Left side Padding Pad sequences to their bucket length with
   ### dictionnary 0-label
   word_vec_list_pad <- lapply(1:length(word_vec_list), function(x) {
     length(word_vec_list[[x]]) <- seq_len[bucketID[x]]
     word_vec_list[[x]][is.na(word_vec_list[[x]])] <- names(dic[1])
-    if (right_pad == F) 
+    if (right_pad == F)
       word_vec_list[[x]] <- rev(word_vec_list[[x]])
     return(word_vec_list[[x]])
   })
-  
+
   ### Assign sequences to buckets and unroll them in order to be reshaped into arrays
-  unrolled_arrays <- lapply(1:length(seq_len), function(x) unlist(word_vec_list_pad[bucketID == 
+  unrolled_arrays <- lapply(1:length(seq_len), function(x) unlist(word_vec_list_pad[bucketID ==
     x]))
-  
+
   ### Assign labels to their buckets
   bucketed_labels <- lapply(1:length(seq_len), function(x) labels[bucketID == x])
   names(bucketed_labels) <- as.character(seq_len)
-  
+
   ### Assign the dictionnary to each bucket terms
   unrolled_arrays_dic <- lapply(1:length(seq_len), function(x) dic[unrolled_arrays[[x]]])
-  
+
   # length(splitted_arrays_dic[[1]]) Reshape into arrays having each sequence into
   # a column
-  features_arrays <- lapply(1:length(seq_len), function(x) array(unrolled_arrays_dic[[x]], 
+  features_arrays <- lapply(1:length(seq_len), function(x) array(unrolled_arrays_dic[[x]],
     dim = c(seq_len[x], length(unrolled_arrays_dic[[x]])/seq_len[x])))
-  
-  features <- lapply(1:length(seq_len), function(x) features_arrays[[x]][1:seq_len[x], 
+
+  features <- lapply(1:length(seq_len), function(x) features_arrays[[x]][1:seq_len[x],
     ])
   names(features) <- as.character(seq_len)
-  
+
   ### Combine data and labels into buckets
-  buckets <- lapply(1:length(seq_len), function(x) c(list(data = features[[x]]), 
+  buckets <- lapply(1:length(seq_len), function(x) c(list(data = features[[x]]),
     list(label = bucketed_labels[[x]])))
   names(buckets) <- as.character(seq_len)
-  
+
   ### reverse dictionnary
   rev_dic <- names(dic)
   names(rev_dic) <- dic
-  
+
   return(list(buckets = buckets, dic = dic, rev_dic = rev_dic))
 }
 
 
-corpus_preprocessed_train <- text_pre_process(corpus = train_raw, count_threshold = 10, 
+corpus_preprocessed_train <- text_pre_process(corpus = train_raw, count_threshold = 10,
   dic = NULL)
 
 # length(corpus_preprocessed_train$dic)
@@ -152,15 +169,15 @@ corpus_preprocessed_train <- readRDS(file = "corpus_preprocessed_train_10.rds")
 corpus_preprocessed_test <- readRDS(file = "corpus_preprocessed_test_10.rds")
 
 
-corpus_bucketed_train <- make_bucket_data(word_vec_list = corpus_preprocessed_train$word_vec_list, 
-  labels = rep(0:1, each = 12500), dic = corpus_preprocessed_train$dic, seq_len = c(100, 
+corpus_bucketed_train <- make_bucket_data(word_vec_list = corpus_preprocessed_train$word_vec_list,
+  labels = rep(0:1, each = 12500), dic = corpus_preprocessed_train$dic, seq_len = c(100,
     200, 300, 500, 800), right_pad = F)
 
 # lapply(corpus_bucketed_train$buckets, function(x) length(x[[2]]))
 
 
-corpus_bucketed_test <- make_bucket_data(word_vec_list = corpus_preprocessed_test$word_vec_list, 
-  labels = rep(0:1, each = 12500), dic = corpus_preprocessed_test$dic, seq_len = c(100, 
+corpus_bucketed_test <- make_bucket_data(word_vec_list = corpus_preprocessed_test$word_vec_list,
+  labels = rep(0:1, each = 12500), dic = corpus_preprocessed_test$dic, seq_len = c(100,
     200, 300, 500, 800), right_pad = F)
 
 # lapply(corpus_bucketed_test$buckets, function(x) length(x[[2]]))
diff --git a/example/rnn/bucket_R/data_preprocessing_seq_to_one.R b/example/rnn/bucket_R/data_preprocessing_seq_to_one.R
index 11c0a0ce4a..a7d73f0acf 100644
--- a/example/rnn/bucket_R/data_preprocessing_seq_to_one.R
+++ b/example/rnn/bucket_R/data_preprocessing_seq_to_one.R
@@ -1,6 +1,23 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
 # download the IMDB dataset
 if (!file.exists("data/aclImdb_v1.tar.gz")) {
-  download.file("http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz", 
+  download.file("http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz",
                 "data/aclImdb_v1.tar.gz")
   untar("data/aclImdb_v1.tar.gz")
 }
@@ -40,7 +57,7 @@ test_raw <- c(negative_test_raw, positive_test_raw)
 text_pre_process <- function(corpus, count_threshold = 10, dic = NULL) {
   raw_vec <- corpus
   raw_vec <- stri_enc_toascii(str = raw_vec)
-  
+
   ### perform some preprocessing
   raw_vec <- str_replace_all(string = raw_vec, pattern = "[^[:print:]]", replacement = "")
   raw_vec <- str_to_lower(string = raw_vec)
@@ -48,12 +65,12 @@ text_pre_process <- function(corpus, count_threshold = 10, dic = NULL) {
   raw_vec <- str_replace_all(string = raw_vec, pattern = "\\bbr\\b", replacement = "")
   raw_vec <- str_replace_all(string = raw_vec, pattern = "\\s+", replacement = " ")
   raw_vec <- str_trim(string = raw_vec)
-  
+
   ### Split raw sequence vectors into lists of word vectors (one list element per
   ### sequence)
-  word_vec_list <- stri_split_boundaries(raw_vec, type = "word", skip_word_none = T, 
+  word_vec_list <- stri_split_boundaries(raw_vec, type = "word", skip_word_none = T,
     skip_word_number = F, simplify = F)
-  
+
   ### Build vocabulary
   if (is.null(dic)) {
     word_vec_unlist <- unlist(word_vec_list)
@@ -63,77 +80,77 @@ text_pre_process <- function(corpus, count_threshold = 10, dic = NULL) {
     stopwords <- c(letters, "an", "the", "br")
     word_keep <- setdiff(word_keep, stopwords)
   } else word_keep <- names(dic)[!dic == 0]
-  
+
   ### Clean the sentences to keep only the curated list of words
   word_vec_list <- lapply(word_vec_list, function(x) x[x %in% word_keep])
-  
+
   # sentence_vec<- stri_split_boundaries(raw_vec, type='sentence', simplify = T)
   word_vec_length <- lapply(word_vec_list, length) %>% unlist()
-  
+
   ### Build dictionnary
   dic <- 1:length(word_keep)
   names(dic) <- word_keep
   dic <- c(`?` = 0, dic)
-  
+
   ### reverse dictionnary
   rev_dic <- names(dic)
   names(rev_dic) <- dic
-  
+
   return(list(word_vec_list = word_vec_list, dic = dic, rev_dic = rev_dic))
 }
 
-################################################################ 
+################################################################
 make_bucket_data <- function(word_vec_list, labels, dic, seq_len = c(225), right_pad = T) {
   ### Trunc sequence to max bucket length
   word_vec_list <- lapply(word_vec_list, head, n = max(seq_len))
-  
+
   word_vec_length <- lapply(word_vec_list, length) %>% unlist()
-  bucketID <- cut(word_vec_length, breaks = c(0, seq_len, Inf), include.lowest = T, 
+  bucketID <- cut(word_vec_length, breaks = c(0, seq_len, Inf), include.lowest = T,
     labels = F)
-  
+
   ### Right or Left side Padding Pad sequences to their bucket length with
   ### dictionnary 0-label
   word_vec_list_pad <- lapply(1:length(word_vec_list), function(x) {
     length(word_vec_list[[x]]) <- seq_len[bucketID[x]]
     word_vec_list[[x]][is.na(word_vec_list[[x]])] <- names(dic[1])
-    if (right_pad == F) 
+    if (right_pad == F)
       word_vec_list[[x]] <- rev(word_vec_list[[x]])
     return(word_vec_list[[x]])
   })
-  
+
   ### Assign sequences to buckets and unroll them in order to be reshaped into arrays
-  unrolled_arrays <- lapply(1:length(seq_len), function(x) unlist(word_vec_list_pad[bucketID == 
+  unrolled_arrays <- lapply(1:length(seq_len), function(x) unlist(word_vec_list_pad[bucketID ==
     x]))
-  
+
   ### Assign labels to their buckets
   bucketed_labels <- lapply(1:length(seq_len), function(x) labels[bucketID == x])
   names(bucketed_labels) <- as.character(seq_len)
-  
+
   ### Assign the dictionnary to each bucket terms
   unrolled_arrays_dic <- lapply(1:length(seq_len), function(x) dic[unrolled_arrays[[x]]])
-  
+
   # Reshape into arrays having each sequence into a row
   features <- lapply(1:length(seq_len), function(x) {
-    t(array(unrolled_arrays_dic[[x]], 
+    t(array(unrolled_arrays_dic[[x]],
           dim = c(seq_len[x], length(unrolled_arrays_dic[[x]])/seq_len[x])))
   })
-  
+
   names(features) <- as.character(seq_len)
-  
+
   ### Combine data and labels into buckets
-  buckets <- lapply(1:length(seq_len), function(x) c(list(data = features[[x]]), 
+  buckets <- lapply(1:length(seq_len), function(x) c(list(data = features[[x]]),
     list(label = bucketed_labels[[x]])))
   names(buckets) <- as.character(seq_len)
-  
+
   ### reverse dictionnary
   rev_dic <- names(dic)
   names(rev_dic) <- dic
-  
+
   return(list(buckets = buckets, dic = dic, rev_dic = rev_dic))
 }
 
 
-corpus_preprocessed_train <- text_pre_process(corpus = train_raw, count_threshold = 10, 
+corpus_preprocessed_train <- text_pre_process(corpus = train_raw, count_threshold = 10,
   dic = NULL)
 
 corpus_preprocessed_test <- text_pre_process(corpus = test_raw, dic = corpus_preprocessed_train$dic)
@@ -143,16 +160,16 @@ quantile(seq_length_dist, 0:20/20)
 
 
 # Save bucketed corpus
-corpus_bucketed_train <- make_bucket_data(word_vec_list = corpus_preprocessed_train$word_vec_list, 
-                                          labels = rep(0:1, each = 12500), 
-                                          dic = corpus_preprocessed_train$dic, 
-                                          seq_len = c(100, 150, 250, 400, 600), 
+corpus_bucketed_train <- make_bucket_data(word_vec_list = corpus_preprocessed_train$word_vec_list,
+                                          labels = rep(0:1, each = 12500),
+                                          dic = corpus_preprocessed_train$dic,
+                                          seq_len = c(100, 150, 250, 400, 600),
                                           right_pad = T)
 
-corpus_bucketed_test <- make_bucket_data(word_vec_list = corpus_preprocessed_test$word_vec_list, 
-                                         labels = rep(0:1, each = 12500), 
-                                         dic = corpus_preprocessed_test$dic, 
-                                         seq_len = c(100, 150, 250, 400, 600), 
+corpus_bucketed_test <- make_bucket_data(word_vec_list = corpus_preprocessed_test$word_vec_list,
+                                         labels = rep(0:1, each = 12500),
+                                         dic = corpus_preprocessed_test$dic,
+                                         seq_len = c(100, 150, 250, 400, 600),
                                          right_pad = T)
 
 saveRDS(corpus_bucketed_train, file = "data/corpus_bucketed_train.rds")
@@ -160,16 +177,16 @@ saveRDS(corpus_bucketed_test, file = "data/corpus_bucketed_test.rds")
 
 
 # Save non bucketed corpus
-corpus_single_train <- make_bucket_data(word_vec_list = corpus_preprocessed_train$word_vec_list, 
-                                          labels = rep(0:1, each = 12500), 
-                                          dic = corpus_preprocessed_train$dic, 
-                                          seq_len = c(600), 
+corpus_single_train <- make_bucket_data(word_vec_list = corpus_preprocessed_train$word_vec_list,
+                                          labels = rep(0:1, each = 12500),
+                                          dic = corpus_preprocessed_train$dic,
+                                          seq_len = c(600),
                                           right_pad = T)
 
-corpus_single_test <- make_bucket_data(word_vec_list = corpus_preprocessed_test$word_vec_list, 
-                                         labels = rep(0:1, each = 12500), 
-                                         dic = corpus_preprocessed_test$dic, 
-                                         seq_len = c(600), 
+corpus_single_test <- make_bucket_data(word_vec_list = corpus_preprocessed_test$word_vec_list,
+                                         labels = rep(0:1, each = 12500),
+                                         dic = corpus_preprocessed_test$dic,
+                                         seq_len = c(600),
                                          right_pad = T)
 
 saveRDS(corpus_single_train, file = "data/corpus_single_train.rds")
diff --git a/example/rnn/bucket_R/gru.cell.R b/example/rnn/bucket_R/gru.cell.R
index 5932cdf17e..91f5917af5 100644
--- a/example/rnn/bucket_R/gru.cell.R
+++ b/example/rnn/bucket_R/gru.cell.R
@@ -1,54 +1,71 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
 # GRU cell symbol
-gru.cell <- function(num.hidden, indata, prev.state, param, seqidx, layeridx, dropout = 0, 
+gru.cell <- function(num.hidden, indata, prev.state, param, seqidx, layeridx, dropout = 0,
   data_masking) {
-  i2h <- mx.symbol.FullyConnected(data = indata, weight = param$gates.i2h.weight, 
-    bias = param$gates.i2h.bias, num.hidden = num.hidden * 2, name = paste0("t", 
+  i2h <- mx.symbol.FullyConnected(data = indata, weight = param$gates.i2h.weight,
+    bias = param$gates.i2h.bias, num.hidden = num.hidden * 2, name = paste0("t",
       seqidx, ".l", layeridx, ".gates.i2h"))
-  
-  if (dropout > 0) 
+
+  if (dropout > 0)
     i2h <- mx.symbol.Dropout(data = i2h, p = dropout)
-  
+
   if (!is.null(prev.state)) {
-    h2h <- mx.symbol.FullyConnected(data = prev.state$h, weight = param$gates.h2h.weight, 
-      bias = param$gates.h2h.bias, num.hidden = num.hidden * 2, name = paste0("t", 
+    h2h <- mx.symbol.FullyConnected(data = prev.state$h, weight = param$gates.h2h.weight,
+      bias = param$gates.h2h.bias, num.hidden = num.hidden * 2, name = paste0("t",
         seqidx, ".l", layeridx, ".gates.h2h"))
     gates <- i2h + h2h
   } else {
     gates <- i2h
   }
-  
-  split.gates <- mx.symbol.split(gates, num.outputs = 2, axis = 1, squeeze.axis = F, 
+
+  split.gates <- mx.symbol.split(gates, num.outputs = 2, axis = 1, squeeze.axis = F,
     name = paste0("t", seqidx, ".l", layeridx, ".split"))
-  
+
   update.gate <- mx.symbol.Activation(split.gates[[1]], act.type = "sigmoid")
   reset.gate <- mx.symbol.Activation(split.gates[[2]], act.type = "sigmoid")
-  
-  htrans.i2h <- mx.symbol.FullyConnected(data = indata, weight = param$trans.i2h.weight, 
-    bias = param$trans.i2h.bias, num.hidden = num.hidden, name = paste0("t", 
+
+  htrans.i2h <- mx.symbol.FullyConnected(data = indata, weight = param$trans.i2h.weight,
+    bias = param$trans.i2h.bias, num.hidden = num.hidden, name = paste0("t",
       seqidx, ".l", layeridx, ".trans.i2h"))
-  
+
   if (is.null(prev.state)) {
     h.after.reset <- reset.gate * 0
   } else {
     h.after.reset <- prev.state$h * reset.gate
   }
-  
-  htrans.h2h <- mx.symbol.FullyConnected(data = h.after.reset, weight = param$trans.h2h.weight, 
-    bias = param$trans.h2h.bias, num.hidden = num.hidden, name = paste0("t", 
+
+  htrans.h2h <- mx.symbol.FullyConnected(data = h.after.reset, weight = param$trans.h2h.weight,
+    bias = param$trans.h2h.bias, num.hidden = num.hidden, name = paste0("t",
       seqidx, ".l", layeridx, ".trans.h2h"))
-  
+
   h.trans <- htrans.i2h + htrans.h2h
   h.trans.active <- mx.symbol.Activation(h.trans, act.type = "tanh")
-  
+
   if (is.null(prev.state)) {
     next.h <- update.gate * h.trans.active
   } else {
     next.h <- prev.state$h + update.gate * (h.trans.active - prev.state$h)
   }
-  
+
   ### Add a mask - using the mask_array approach
   data_mask_expand <- mx.symbol.Reshape(data = data_masking, shape = c(1, -2))
   next.h <- mx.symbol.broadcast_mul(lhs = next.h, rhs = data_mask_expand)
-  
+
   return(list(h = next.h))
 }
diff --git a/example/rnn/bucket_R/lstm.cell.R b/example/rnn/bucket_R/lstm.cell.R
index 3c7b0e456d..5f82ad8276 100644
--- a/example/rnn/bucket_R/lstm.cell.R
+++ b/example/rnn/bucket_R/lstm.cell.R
@@ -1,41 +1,58 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
 # LSTM cell symbol
-lstm.cell <- function(num.hidden, indata, prev.state, param, seqidx, layeridx, dropout = 0, 
+lstm.cell <- function(num.hidden, indata, prev.state, param, seqidx, layeridx, dropout = 0,
   data_masking) {
-  i2h <- mx.symbol.FullyConnected(data = indata, weight = param$i2h.weight, bias = param$i2h.bias, 
+  i2h <- mx.symbol.FullyConnected(data = indata, weight = param$i2h.weight, bias = param$i2h.bias,
     num.hidden = num.hidden * 4, name = paste0("t", seqidx, ".l", layeridx, ".i2h"))
-  
-  if (dropout > 0) 
+
+  if (dropout > 0)
     i2h <- mx.symbol.Dropout(data = i2h, p = dropout)
-  
+
   if (!is.null(prev.state)) {
-    h2h <- mx.symbol.FullyConnected(data = prev.state$h, weight = param$h2h.weight, 
-      bias = param$h2h.bias, num.hidden = num.hidden * 4, name = paste0("t", 
+    h2h <- mx.symbol.FullyConnected(data = prev.state$h, weight = param$h2h.weight,
+      bias = param$h2h.bias, num.hidden = num.hidden * 4, name = paste0("t",
         seqidx, ".l", layeridx, ".h2h"))
     gates <- i2h + h2h
   } else {
     gates <- i2h
   }
-  
-  split.gates <- mx.symbol.split(gates, num.outputs = 4, axis = 1, squeeze.axis = F, 
+
+  split.gates <- mx.symbol.split(gates, num.outputs = 4, axis = 1, squeeze.axis = F,
     name = paste0("t", seqidx, ".l", layeridx, ".slice"))
-  
+
   in.gate <- mx.symbol.Activation(split.gates[[1]], act.type = "sigmoid")
   in.transform <- mx.symbol.Activation(split.gates[[2]], act.type = "tanh")
   forget.gate <- mx.symbol.Activation(split.gates[[3]], act.type = "sigmoid")
   out.gate <- mx.symbol.Activation(split.gates[[4]], act.type = "sigmoid")
-  
+
   if (is.null(prev.state)) {
     next.c <- in.gate * in.transform
   } else {
     next.c <- (forget.gate * prev.state$c) + (in.gate * in.transform)
   }
-  
+
   next.h <- out.gate * mx.symbol.Activation(next.c, act.type = "tanh")
-  
+
   ### Add a mask - using the mask_array approach
   data_mask_expand <- mx.symbol.Reshape(data = data_masking, shape = c(1, -2))
   next.c <- mx.symbol.broadcast_mul(lhs = next.c, rhs = data_mask_expand)
   next.h <- mx.symbol.broadcast_mul(lhs = next.h, rhs = data_mask_expand)
-  
+
   return(list(c = next.c, h = next.h))
 }
diff --git a/example/rnn/bucket_R/mx.io.bucket.iter.R b/example/rnn/bucket_R/mx.io.bucket.iter.R
index 61f87957ed..febed2178c 100644
--- a/example/rnn/bucket_R/mx.io.bucket.iter.R
+++ b/example/rnn/bucket_R/mx.io.bucket.iter.R
@@ -1,6 +1,23 @@
-BucketIter <- setRefClass("BucketIter", fields = c("buckets", "bucket.names", "batch.size", 
-  "data.mask.element", "shuffle", "bucket.plan", "bucketID", "epoch", "batch", 
-  "batch.per.epoch", "seed"), contains = "Rcpp_MXArrayDataIter", methods = list(initialize = function(buckets, 
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+BucketIter <- setRefClass("BucketIter", fields = c("buckets", "bucket.names", "batch.size",
+  "data.mask.element", "shuffle", "bucket.plan", "bucketID", "epoch", "batch",
+  "batch.per.epoch", "seed"), contains = "Rcpp_MXArrayDataIter", methods = list(initialize = function(buckets,
   batch.size, data.mask.element = 0, shuffle = FALSE, seed = 123) {
   .self$buckets <- buckets
   .self$bucket.names <- names(.self$buckets)
@@ -25,16 +42,16 @@ BucketIter <- setRefClass("BucketIter", fields = c("buckets", "bucket.names", "b
   .self$batch.per.epoch <- sum(batch_per_bucket)
   .self$epoch <- .self$epoch + 1
   .self$batch <- 0
-  
+
   if (.self$shuffle) {
     set.seed(.self$seed)
     bucket_plan_names <- sample(rep(names(batch_per_bucket), times = batch_per_bucket))
-    .self$bucket.plan <- ave(bucket_plan_names == bucket_plan_names, bucket_plan_names, 
+    .self$bucket.plan <- ave(bucket_plan_names == bucket_plan_names, bucket_plan_names,
       FUN = cumsum)
     names(.self$bucket.plan) <- bucket_plan_names
     ### Return first BucketID at reset for initialization of the model
     .self$bucketID <- .self$bucket.plan[1]
-    
+
     .self$buckets <- lapply(.self$buckets, function(x) {
       shuffle_id <- sample(ncol(x$data))
       if (length(dim(x$label)) == 0) {
@@ -45,7 +62,7 @@ BucketIter <- setRefClass("BucketIter", fields = c("buckets", "bucket.names", "b
     })
   } else {
     bucket_plan_names <- rep(names(batch_per_bucket), times = batch_per_bucket)
-    .self$bucket.plan <- ave(bucket_plan_names == bucket_plan_names, bucket_plan_names, 
+    .self$bucket.plan <- ave(bucket_plan_names == bucket_plan_names, bucket_plan_names,
       FUN = cumsum)
     names(.self$bucket.plan) <- bucket_plan_names
   }
@@ -70,12 +87,12 @@ BucketIter <- setRefClass("BucketIter", fields = c("buckets", "bucket.names", "b
   } else {
     label <- .self$buckets[[names(.self$bucketID)]]$label[, idx, drop = F]
   }
-  return(list(data = mx.nd.array(data), data.mask.array = mx.nd.array(data_mask_array), 
+  return(list(data = mx.nd.array(data), data.mask.array = mx.nd.array(data_mask_array),
     label = mx.nd.array(label)))
 }, finalize = function() {
 }))
 
-# 
+#
 #' Create Bucket Iter
 #'
 #' @param buckets The data array.
@@ -85,8 +102,8 @@ BucketIter <- setRefClass("BucketIter", fields = c("buckets", "bucket.names", "b
 #' @param seed The random seed
 #'
 #' @export
-mx.io.bucket.iter <- function(buckets, batch.size, data.mask.element = 0, shuffle = FALSE, 
+mx.io.bucket.iter <- function(buckets, batch.size, data.mask.element = 0, shuffle = FALSE,
   seed = 123) {
-  return(BucketIter$new(buckets = buckets, batch.size = batch.size, data.mask.element = data.mask.element, 
+  return(BucketIter$new(buckets = buckets, batch.size = batch.size, data.mask.element = data.mask.element,
     shuffle = shuffle, seed = seed))
 }
diff --git a/example/rnn/bucket_R/rnn.R b/example/rnn/bucket_R/rnn.R
index ea02b959a7..3485cd1c87 100644
--- a/example/rnn/bucket_R/rnn.R
+++ b/example/rnn/bucket_R/rnn.R
@@ -1,104 +1,121 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
 library(mxnet)
 
 source("lstm.cell.R")
 source("gru.cell.R")
 
 # unrolled RNN network
-rnn.unroll <- function(num.rnn.layer, seq.len, input.size, num.embed, num.hidden, 
-  num.label, dropout = 0, ignore_label = 0, init.state = NULL, config, cell.type = "lstm", 
+rnn.unroll <- function(num.rnn.layer, seq.len, input.size, num.embed, num.hidden,
+  num.label, dropout = 0, ignore_label = 0, init.state = NULL, config, cell.type = "lstm",
   output_last_state = F) {
   embed.weight <- mx.symbol.Variable("embed.weight")
   cls.weight <- mx.symbol.Variable("cls.weight")
   cls.bias <- mx.symbol.Variable("cls.bias")
-  
+
   param.cells <- lapply(1:num.rnn.layer, function(i) {
     if (cell.type == "lstm") {
-      cell <- list(i2h.weight = mx.symbol.Variable(paste0("l", i, ".i2h.weight")), 
-        i2h.bias = mx.symbol.Variable(paste0("l", i, ".i2h.bias")), h2h.weight = mx.symbol.Variable(paste0("l", 
-          i, ".h2h.weight")), h2h.bias = mx.symbol.Variable(paste0("l", i, 
+      cell <- list(i2h.weight = mx.symbol.Variable(paste0("l", i, ".i2h.weight")),
+        i2h.bias = mx.symbol.Variable(paste0("l", i, ".i2h.bias")), h2h.weight = mx.symbol.Variable(paste0("l",
+          i, ".h2h.weight")), h2h.bias = mx.symbol.Variable(paste0("l", i,
           ".h2h.bias")))
     } else if (cell.type == "gru") {
-      cell <- list(gates.i2h.weight = mx.symbol.Variable(paste0("l", i, ".gates.i2h.weight")), 
-        gates.i2h.bias = mx.symbol.Variable(paste0("l", i, ".gates.i2h.bias")), 
-        gates.h2h.weight = mx.symbol.Variable(paste0("l", i, ".gates.h2h.weight")), 
-        gates.h2h.bias = mx.symbol.Variable(paste0("l", i, ".gates.h2h.bias")), 
-        trans.i2h.weight = mx.symbol.Variable(paste0("l", i, ".trans.i2h.weight")), 
-        trans.i2h.bias = mx.symbol.Variable(paste0("l", i, ".trans.i2h.bias")), 
-        trans.h2h.weight = mx.symbol.Variable(paste0("l", i, ".trans.h2h.weight")), 
+      cell <- list(gates.i2h.weight = mx.symbol.Variable(paste0("l", i, ".gates.i2h.weight")),
+        gates.i2h.bias = mx.symbol.Variable(paste0("l", i, ".gates.i2h.bias")),
+        gates.h2h.weight = mx.symbol.Variable(paste0("l", i, ".gates.h2h.weight")),
+        gates.h2h.bias = mx.symbol.Variable(paste0("l", i, ".gates.h2h.bias")),
+        trans.i2h.weight = mx.symbol.Variable(paste0("l", i, ".trans.i2h.weight")),
+        trans.i2h.bias = mx.symbol.Variable(paste0("l", i, ".trans.i2h.bias")),
+        trans.h2h.weight = mx.symbol.Variable(paste0("l", i, ".trans.h2h.weight")),
         trans.h2h.bias = mx.symbol.Variable(paste0("l", i, ".trans.h2h.bias")))
     }
     return(cell)
   })
-  
+
   # embeding layer
   label <- mx.symbol.Variable("label")
   data <- mx.symbol.Variable("data")
   data_mask_array <- mx.symbol.Variable("data.mask.array")
   data_mask_array <- mx.symbol.stop_gradient(data_mask_array, name = "data.mask.array")
-  
-  embed <- mx.symbol.Embedding(data = data, input_dim = input.size, weight = embed.weight, 
+
+  embed <- mx.symbol.Embedding(data = data, input_dim = input.size, weight = embed.weight,
     output_dim = num.embed, name = "embed")
-  
+
   wordvec <- mx.symbol.split(data = embed, axis = 1, num.outputs = seq.len, squeeze_axis = T)
-  data_mask_split <- mx.symbol.split(data = data_mask_array, axis = 1, num.outputs = seq.len, 
+  data_mask_split <- mx.symbol.split(data = data_mask_array, axis = 1, num.outputs = seq.len,
     squeeze_axis = T)
-  
+
   last.hidden <- list()
   last.states <- list()
   decode <- list()
   softmax <- list()
   fc <- list()
-  
+
   for (seqidx in 1:seq.len) {
     hidden <- wordvec[[seqidx]]
-    
+
     for (i in 1:num.rnn.layer) {
       if (seqidx == 1) {
         prev.state <- init.state[[i]]
       } else {
         prev.state <- last.states[[i]]
       }
-      
+
       if (cell.type == "lstm") {
         cell.symbol <- lstm.cell
       } else if (cell.type == "gru") {
         cell.symbol <- gru.cell
       }
-      
-      next.state <- cell.symbol(num.hidden = num.hidden, indata = hidden, prev.state = prev.state, 
-        param = param.cells[[i]], seqidx = seqidx, layeridx = i, dropout = dropout, 
+
+      next.state <- cell.symbol(num.hidden = num.hidden, indata = hidden, prev.state = prev.state,
+        param = param.cells[[i]], seqidx = seqidx, layeridx = i, dropout = dropout,
         data_masking = data_mask_split[[seqidx]])
       hidden <- next.state$h
       # if (dropout > 0) hidden <- mx.symbol.Dropout(data=hidden, p=dropout)
       last.states[[i]] <- next.state
     }
-    
+
     # Decoding
     if (config == "one-to-one") {
       last.hidden <- c(last.hidden, hidden)
     }
   }
-  
+
   if (config == "seq-to-one") {
-    fc <- mx.symbol.FullyConnected(data = hidden, weight = cls.weight, bias = cls.bias, 
+    fc <- mx.symbol.FullyConnected(data = hidden, weight = cls.weight, bias = cls.bias,
       num.hidden = num.label)
-    
+
     loss <- mx.symbol.SoftmaxOutput(data = fc, name = "sm", label = label, ignore_label = ignore_label)
-    
+
   } else if (config == "one-to-one") {
-    last.hidden_expand <- lapply(last.hidden, function(i) mx.symbol.expand_dims(i, 
+    last.hidden_expand <- lapply(last.hidden, function(i) mx.symbol.expand_dims(i,
       axis = 1))
     concat <- mx.symbol.concat(last.hidden_expand, num.args = seq.len, dim = 1)
     reshape <- mx.symbol.Reshape(concat, shape = c(num.hidden, -1))
-    
-    fc <- mx.symbol.FullyConnected(data = reshape, weight = cls.weight, bias = cls.bias, 
+
+    fc <- mx.symbol.FullyConnected(data = reshape, weight = cls.weight, bias = cls.bias,
       num.hidden = num.label)
-    
+
     label <- mx.symbol.reshape(data = label, shape = c(-1))
     loss <- mx.symbol.SoftmaxOutput(data = fc, name = "sm", label = label, ignore_label = ignore_label)
-    
+
   }
-  
+
   if (output_last_state) {
     group <- mx.symbol.Group(c(unlist(last.states), loss))
     return(group)
@@ -108,32 +125,32 @@ rnn.unroll <- function(num.rnn.layer, seq.len, input.size, num.embed, num.hidden
 }
 
 ########################################### mx.rnn.buckets
-mx.rnn.buckets <- function(train.data, eval.data = NULL, num.rnn.layer, num.hidden, 
-  num.embed, num.label, input.size, ctx = NULL, num.round = 1, initializer = mx.init.uniform(0.01), 
-  dropout = 0, config = "one-to-one", optimizer = "sgd", batch.end.callback = NULL, 
-  epoch.end.callback = NULL, begin.round = 1, metric = mx.metric.rmse, cell.type = "lstm", 
+mx.rnn.buckets <- function(train.data, eval.data = NULL, num.rnn.layer, num.hidden,
+  num.embed, num.label, input.size, ctx = NULL, num.round = 1, initializer = mx.init.uniform(0.01),
+  dropout = 0, config = "one-to-one", optimizer = "sgd", batch.end.callback = NULL,
+  epoch.end.callback = NULL, begin.round = 1, metric = mx.metric.rmse, cell.type = "lstm",
   kvstore = "local", verbose = FALSE) {
-  
+
   if (!train.data$iter.next()) {
     train.data$reset()
-    if (!train.data$iter.next()) 
+    if (!train.data$iter.next())
       stop("Empty train.data")
   }
-  
+
   if (!is.null(eval.data)) {
     if (!eval.data$iter.next()) {
       eval.data$reset()
-      if (!eval.data$iter.next()) 
+      if (!eval.data$iter.next())
         stop("Empty eval.data")
     }
   }
-  
-  if (is.null(ctx)) 
+
+  if (is.null(ctx))
     ctx <- mx.ctx.default()
   if (is.mx.context(ctx)) {
     ctx <- list(ctx)
   }
-  if (!is.list(ctx)) 
+  if (!is.list(ctx))
     stop("ctx must be mx.context or list of mx.context")
   if (is.character(optimizer)) {
     if (is.numeric(input.shape)) {
@@ -145,17 +162,17 @@ mx.rnn.buckets <- function(train.data, eval.data = NULL, num.rnn.layer, num.hidd
     }
     optimizer <- mx.opt.create(optimizer, rescale.grad = (1/batchsize), ...)
   }
-  
+
   # get unrolled lstm symbol
   sym_list <- sapply(train.data$bucket.names, function(x) {
-    rnn.unroll(num.rnn.layer = num.rnn.layer, num.hidden = num.hidden, seq.len = as.integer(x), 
-      input.size = input.size, num.embed = num.embed, num.label = num.label, 
+    rnn.unroll(num.rnn.layer = num.rnn.layer, num.hidden = num.hidden, seq.len = as.integer(x),
+      input.size = input.size, num.embed = num.embed, num.label = num.label,
       dropout = dropout, cell.type = cell.type, config = config)
   }, simplify = F, USE.NAMES = T)
-  
+
   # setup lstm model
   symbol <- sym_list[[names(train.data$bucketID)]]
-  
+
   arg.names <- symbol$arguments
   input.names <- c("data", "data.mask.array")
   input.shape <- sapply(input.names, function(n) {
@@ -165,21 +182,21 @@ mx.rnn.buckets <- function(train.data, eval.data = NULL, num.rnn.layer, num.hidd
   output.shape <- sapply(output.names, function(n) {
     dim(train.data$value()[[n]])
   }, simplify = FALSE)
-  
-  params <- mx.model.init.params(symbol, input.shape, output.shape, initializer, 
+
+  params <- mx.model.init.params(symbol, input.shape, output.shape, initializer,
     mx.cpu())
-  
-  kvstore <- mxnet:::mx.model.create.kvstore(kvstore, params$arg.params, length(ctx), 
+
+  kvstore <- mxnet:::mx.model.create.kvstore(kvstore, params$arg.params, length(ctx),
     verbose = verbose)
-  
+
   ### Execute training - rnn.model.R
-  model <- mx.model.train.rnn.buckets(sym_list = sym_list, input.shape = input.shape, 
-    output.shape = output.shape, arg.params = params$arg.params, aux.params = params$aux.params, 
-    optimizer = optimizer, train.data = train.data, eval.data = eval.data, verbose = verbose, 
-    begin.round = begin.round, end.round = num.round, metric = metric, ctx = ctx, 
-    batch.end.callback = batch.end.callback, epoch.end.callback = epoch.end.callback, 
+  model <- mx.model.train.rnn.buckets(sym_list = sym_list, input.shape = input.shape,
+    output.shape = output.shape, arg.params = params$arg.params, aux.params = params$aux.params,
+    optimizer = optimizer, train.data = train.data, eval.data = eval.data, verbose = verbose,
+    begin.round = begin.round, end.round = num.round, metric = metric, ctx = ctx,
+    batch.end.callback = batch.end.callback, epoch.end.callback = epoch.end.callback,
     kvstore = kvstore)
-  
+
   return(model)
 }
 
diff --git a/example/rnn/bucket_R/rnn.infer.R b/example/rnn/bucket_R/rnn.infer.R
index 41488aac89..a3b9f53395 100644
--- a/example/rnn/bucket_R/rnn.infer.R
+++ b/example/rnn/bucket_R/rnn.infer.R
@@ -1,8 +1,25 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
 library(mxnet)
 
 source("rnn.R")
 
-mx.rnn.infer.buckets <- function(infer_iter, model, config, ctx = mx.cpu(), output_last_state = FALSE, 
+mx.rnn.infer.buckets <- function(infer_iter, model, config, ctx = mx.cpu(), output_last_state = FALSE,
   init.state = NULL, cell.type = "lstm") {
   ### Infer parameters from model
   if (cell.type == "lstm") {
@@ -12,65 +29,65 @@ mx.rnn.infer.buckets <- function(infer_iter, model, config, ctx = mx.cpu(), outp
     num.rnn.layer <- round((length(model$arg.params) - 3)/8)
     num.hidden <- dim(model$arg.params$l1.gates.h2h.weight)[1]
   }
-  
+
   input.size <- dim(model$arg.params$embed.weight)[2]
   num.embed <- dim(model$arg.params$embed.weight)[1]
   num.label <- dim(model$arg.params$cls.bias)
-  
+
   ### Initialise the iterator
   infer_iter$reset()
   infer_iter$iter.next()
   batch_size <- infer_iter$batch.size
-  
+
   # get unrolled lstm symbol
   sym_list <- sapply(infer_iter$bucket.names, function(x) {
-    rnn.unroll(num.rnn.layer = num.rnn.layer, num.hidden = num.hidden, seq.len = as.integer(x), 
-      input.size = input.size, num.embed = num.embed, num.label = num.label, 
-      config = config, dropout = 0, init.state = init.state, cell.type = cell.type, 
+    rnn.unroll(num.rnn.layer = num.rnn.layer, num.hidden = num.hidden, seq.len = as.integer(x),
+      input.size = input.size, num.embed = num.embed, num.label = num.label,
+      config = config, dropout = 0, init.state = init.state, cell.type = cell.type,
       output_last_state = output_last_state)
   }, simplify = F, USE.NAMES = T)
-  
+
   symbol <- sym_list[[names(infer_iter$bucketID)]]
-  
+
   input.shape <- lapply(infer_iter$value(), dim)
   input.shape <- input.shape[names(input.shape) %in% arguments(symbol)]
-  
+
   infer_shapes <- symbol$infer.shape(input.shape)
   arg.params <- model$arg.params
   aux.params <- model$aux.params
-  
+
   input.names <- names(input.shape)
   arg.names <- names(arg.params)
-  
+
   # Grad request
   grad_req <- rep("null", length(symbol$arguments))
-  
+
   # Arg array order
   update_names <- c(input.names, arg.names)
   arg_update_idx <- match(symbol$arguments, update_names)
-  
+
   # Initial input shapes - need to be adapted for multi-devices - divide highest
   # dimension by device nb
   s <- sapply(input.shape, function(shape) {
     mx.nd.zeros(shape = shape, ctx = mx.cpu())
   })
-  
-  train.execs <- mxnet:::mx.symbol.bind(symbol = symbol, arg.arrays = c(s, arg.params)[arg_update_idx], 
+
+  train.execs <- mxnet:::mx.symbol.bind(symbol = symbol, arg.arrays = c(s, arg.params)[arg_update_idx],
     aux.arrays = aux.params, ctx = ctx, grad.req = grad_req)
-  
+
   packer <- mxnet:::mx.nd.arraypacker()
   infer_iter$reset()
   while (infer_iter$iter.next()) {
     # Get input data slice
     dlist <- infer_iter$value()[input.names]
-    
+
     symbol <- sym_list[[names(infer_iter$bucketID)]]
-    
-    texec <- mxnet:::mx.symbol.bind(symbol = symbol, arg.arrays = c(dlist, train.execs$arg.arrays[arg.names])[arg_update_idx], 
+
+    texec <- mxnet:::mx.symbol.bind(symbol = symbol, arg.arrays = c(dlist, train.execs$arg.arrays[arg.names])[arg_update_idx],
       aux.arrays = train.execs$aux.arrays, ctx = ctx, grad.req = grad_req)
-    
+
     mx.exec.forward(texec, is.train = FALSE)
-    
+
     out.preds <- mx.nd.copyto(texec$ref.outputs[[1]], mx.cpu())
     packer$push(out.preds)
   }
diff --git a/example/rnn/bucket_R/rnn.train.R b/example/rnn/bucket_R/rnn.train.R
index b833b2b1d3..d587e97fde 100644
--- a/example/rnn/bucket_R/rnn.train.R
+++ b/example/rnn/bucket_R/rnn.train.R
@@ -1,44 +1,61 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
 library(mxnet)
 
 source("rnn.R")
 
 # Internal function to do multiple device training on RNN
-mx.model.train.rnn.buckets <- function(ctx, sym_list, arg.params, aux.params, input.shape, 
-  output.shape, begin.round, end.round, optimizer, train.data, eval.data, metric, 
+mx.model.train.rnn.buckets <- function(ctx, sym_list, arg.params, aux.params, input.shape,
+  output.shape, begin.round, end.round, optimizer, train.data, eval.data, metric,
   epoch.end.callback, batch.end.callback, kvstore, verbose = TRUE) {
   symbol <- sym_list[[names(train.data$bucketID)]]
-  
+
   input.names <- names(input.shape)
   output.names <- names(output.shape)
   arg.names <- names(arg.params)
-  
+
   ndevice <- length(ctx)
-  if (verbose) 
+  if (verbose)
     message(paste0("Start training with ", ndevice, " devices"))
   input_slice <- mxnet:::mx.model.slice.shape(input.shape, ndevice)
   output_slice <- mxnet:::mx.model.slice.shape(output.shape, ndevice)
-  
-  
+
+
   # Grad request
   grad_req <- rep("write", length(symbol$arguments))
   # grad_null_idx <- match(c(input.names, output.names), symbol$arguments)
   grad_null_idx <- match(input.names, symbol$arguments)
   grad_req[grad_null_idx] <- "null"
-  
+
   # Arg array order
   update_names <- c(input.names, output.names, arg.names)
   arg_update_idx <- match(symbol$arguments, update_names)
-  
+
   train.execs <- lapply(1:ndevice, function(i) {
     s <- sapply(append(input_slice[[i]]$shape, output_slice[[i]]$shape), function(shape) {
       mx.nd.zeros(shape = shape, ctx = mx.cpu())
     })
-    mxnet:::mx.symbol.bind(symbol = symbol, arg.arrays = c(s, arg.params)[arg_update_idx], 
+    mxnet:::mx.symbol.bind(symbol = symbol, arg.arrays = c(s, arg.params)[arg_update_idx],
       aux.arrays = aux.params, ctx = mx.cpu(), grad.req = grad_req)
   })
-  
+
   # KVStore related stuffs
-  params.index <- as.integer(mxnet:::mx.util.filter.null(lapply(1:length(train.execs[[1]]$ref.grad.arrays), 
+  params.index <- as.integer(mxnet:::mx.util.filter.null(lapply(1:length(train.execs[[1]]$ref.grad.arrays),
     function(k) {
       if (!is.null(train.execs[[1]]$ref.grad.arrays[[k]])) k else NULL
     })))
@@ -51,11 +68,11 @@ mx.model.train.rnn.buckets <- function(ctx, sym_list, arg.params, aux.params, in
       mx.opt.get.updater(optimizer, train.execs[[i]]$ref.arg.arrays)
     })
   }
-  
+
   if (!is.null(kvstore)) {
     kvstore$init(params.index, train.execs[[1]]$ref.arg.arrays[params.index])
   }
-  
+
   for (iteration in begin.round:end.round) {
     nbatch <- 0
     if (!is.null(metric)) {
@@ -72,25 +89,25 @@ mx.model.train.rnn.buckets <- function(ctx, sym_list, arg.params, aux.params, in
         })
         return(ret)
       })
-      
+
       train.execs <- lapply(1:ndevice, function(i) {
         s <- slices[[i]]
-        mxnet:::mx.symbol.bind(symbol = symbol, arg.arrays = c(s, train.execs[[i]]$arg.arrays[arg.names])[arg_update_idx], 
+        mxnet:::mx.symbol.bind(symbol = symbol, arg.arrays = c(s, train.execs[[i]]$arg.arrays[arg.names])[arg_update_idx],
           aux.arrays = train.execs[[i]]$aux.arrays, ctx = ctx[[i]], grad.req = grad_req)
       })
-      
+
       for (texec in train.execs) {
         mx.exec.forward(texec, is.train = TRUE)
       }
-      
+
       out.preds <- lapply(train.execs, function(texec) {
         mx.nd.copyto(texec$ref.outputs[[1]], mx.cpu())
       })
-      
+
       for (texec in train.execs) {
         mx.exec.backward(texec)
       }
-      
+
       if (!is.null(kvstore)) {
         # push the gradient
         kvstore$push(params.index, lapply(train.execs, function(texec) {
@@ -116,29 +133,29 @@ mx.model.train.rnn.buckets <- function(ctx, sym_list, arg.params, aux.params, in
           mx.exec.update.arg.arrays(train.execs[[i]], arg.blocks[[i]], skip.null = TRUE)
         }
       }
-      
+
       # Update the evaluation metrics
       if (!is.null(metric)) {
         # train.metric <- metric$update(dlist$label, out.preds, train.metric)
         for (i in 1:ndevice) {
-          train.metric <- metric$update(slices[[i]][[length(slices[[i]])]], 
+          train.metric <- metric$update(slices[[i]][[length(slices[[i]])]],
           out.preds[[i]], train.metric)
         }
       }
-      
+
       nbatch <- nbatch + 1
-      
+
       if (!is.null(batch.end.callback)) {
         batch.end.callback(iteration, nbatch, environment())
       }
     }
-    
+
     if (!is.null(metric)) {
       result <- metric$get(train.metric)
-      if (verbose) 
+      if (verbose)
         message(paste0("[", iteration, "] Train-", result$name, "=", result$value))
     }
-    
+
     if (!is.null(eval.data)) {
       if (!is.null(metric)) {
         eval.metric <- metric$init()
@@ -155,35 +172,35 @@ mx.model.train.rnn.buckets <- function(ctx, sym_list, arg.params, aux.params, in
           })
           return(ret)
         })
-        
-        
+
+
         train.execs <- lapply(1:ndevice, function(i) {
           s <- slices[[i]]
-          mxnet:::mx.symbol.bind(symbol = symbol, arg.arrays = c(s, train.execs[[i]]$arg.arrays[arg.names])[arg_update_idx], 
+          mxnet:::mx.symbol.bind(symbol = symbol, arg.arrays = c(s, train.execs[[i]]$arg.arrays[arg.names])[arg_update_idx],
           aux.arrays = train.execs[[i]]$aux.arrays, ctx = ctx[[i]], grad.req = grad_req)
         })
-        
+
         for (texec in train.execs) {
           mx.exec.forward(texec, is.train = FALSE)
         }
-        
+
         # copy outputs to CPU
         out.preds <- lapply(train.execs, function(texec) {
           mx.nd.copyto(texec$ref.outputs[[1]], mx.cpu())
         })
-        
+
         if (!is.null(metric)) {
           for (i in 1:ndevice) {
-          eval.metric <- metric$update(slices[[i]][[length(slices[[i]])]], 
+          eval.metric <- metric$update(slices[[i]][[length(slices[[i]])]],
             out.preds[[i]], eval.metric)
           }
         }
       }
-      
+
       if (!is.null(metric)) {
         result <- metric$get(eval.metric)
         if (verbose) {
-          message(paste0("[", iteration, "] Validation-", result$name, "=", 
+          message(paste0("[", iteration, "] Validation-", result$name, "=",
           result$value))
         }
       }
@@ -192,12 +209,12 @@ mx.model.train.rnn.buckets <- function(ctx, sym_list, arg.params, aux.params, in
     }
     # get the model out
     model <- mxnet:::mx.model.extract.model(symbol, train.execs)
-    
+
     epoch_continue <- TRUE
     if (!is.null(epoch.end.callback)) {
       epoch_continue <- epoch.end.callback(iteration, 0, environment(), verbose = verbose)
     }
-    
+
     if (!epoch_continue) {
       break
     }
diff --git a/example/rnn/get_ptb_data.sh b/example/rnn/get_ptb_data.sh
index d2641cb32b..0a0c7051b0 100755
--- a/example/rnn/get_ptb_data.sh
+++ b/example/rnn/get_ptb_data.sh
@@ -17,6 +17,11 @@
 # specific language governing permissions and limitations
 # under the License.
 
+echo ""
+echo "NOTE: Please review the licensing of the datasets in this script before proceeding"
+echo "See https://catalog.ldc.upenn.edu/ldc99t42 for the licensing"
+echo "Once that is done, please uncomment the wget commands in this script"
+echo ""
 
 RNN_DIR=$(cd `dirname $0`; pwd)
 DATA_DIR="${RNN_DIR}/data/"
@@ -26,7 +31,7 @@ if [[ ! -d "${DATA_DIR}" ]]; then
   mkdir -p ${DATA_DIR}
 fi
 
-wget -P ${DATA_DIR} https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/ptb/ptb.train.txt;
-wget -P ${DATA_DIR} https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/ptb/ptb.valid.txt;
-wget -P ${DATA_DIR} https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/ptb/ptb.test.txt;
-wget -P ${DATA_DIR} https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/tinyshakespeare/input.txt;
+#wget -P ${DATA_DIR} https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/ptb/ptb.train.txt;
+#wget -P ${DATA_DIR} https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/ptb/ptb.valid.txt;
+#wget -P ${DATA_DIR} https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/ptb/ptb.test.txt;
+#wget -P ${DATA_DIR} https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/tinyshakespeare/input.txt;
diff --git a/example/rnn/old/get_ptb_data.sh b/example/rnn/old/get_ptb_data.sh
index d2641cb32b..0a0c7051b0 100755
--- a/example/rnn/old/get_ptb_data.sh
+++ b/example/rnn/old/get_ptb_data.sh
@@ -17,6 +17,11 @@
 # specific language governing permissions and limitations
 # under the License.
 
+echo ""
+echo "NOTE: Please review the licensing of the datasets in this script before proceeding"
+echo "See https://catalog.ldc.upenn.edu/ldc99t42 for the licensing"
+echo "Once that is done, please uncomment the wget commands in this script"
+echo ""
 
 RNN_DIR=$(cd `dirname $0`; pwd)
 DATA_DIR="${RNN_DIR}/data/"
@@ -26,7 +31,7 @@ if [[ ! -d "${DATA_DIR}" ]]; then
   mkdir -p ${DATA_DIR}
 fi
 
-wget -P ${DATA_DIR} https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/ptb/ptb.train.txt;
-wget -P ${DATA_DIR} https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/ptb/ptb.valid.txt;
-wget -P ${DATA_DIR} https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/ptb/ptb.test.txt;
-wget -P ${DATA_DIR} https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/tinyshakespeare/input.txt;
+#wget -P ${DATA_DIR} https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/ptb/ptb.train.txt;
+#wget -P ${DATA_DIR} https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/ptb/ptb.valid.txt;
+#wget -P ${DATA_DIR} https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/ptb/ptb.test.txt;
+#wget -P ${DATA_DIR} https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/tinyshakespeare/input.txt;
diff --git a/example/speech-demo/decode_mxnet.sh b/example/speech-demo/decode_mxnet.sh
index d300d0e91c..983b14c5e0 100755
--- a/example/speech-demo/decode_mxnet.sh
+++ b/example/speech-demo/decode_mxnet.sh
@@ -1,23 +1,5 @@
 #!/bin/bash
 
-# Licensed to the Apache Software Foundation (ASF) under one
-# or more contributor license agreements.  See the NOTICE file
-# distributed with this work for additional information
-# regarding copyright ownership.  The ASF licenses this file
-# to you under the Apache License, Version 2.0 (the
-# "License"); you may not use this file except in compliance
-# with the License.  You may obtain a copy of the License at
-#
-#   http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing,
-# software distributed under the License is distributed on an
-# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-# KIND, either express or implied.  See the License for the
-# specific language governing permissions and limitations
-# under the License.
-
-
 # Copyright 2012-2013 Karel Vesely, Daniel Povey
 # 	    2015 Yu Zhang
 # Apache 2.0
diff --git a/example/speech-demo/io_func/convert2kaldi.py b/example/speech-demo/io_func/convert2kaldi.py
index eac8ee695a..6ea7bc4be0 100644
--- a/example/speech-demo/io_func/convert2kaldi.py
+++ b/example/speech-demo/io_func/convert2kaldi.py
@@ -1,34 +1,6 @@
-# Licensed to the Apache Software Foundation (ASF) under one
-# or more contributor license agreements.  See the NOTICE file
-# distributed with this work for additional information
-# regarding copyright ownership.  The ASF licenses this file
-# to you under the Apache License, Version 2.0 (the
-# "License"); you may not use this file except in compliance
-# with the License.  You may obtain a copy of the License at
-#
-#   http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing,
-# software distributed under the License is distributed on an
-# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-# KIND, either express or implied.  See the License for the
-# specific language governing permissions and limitations
-# under the License.
 
 # Copyright 2013    Yajie Miao    Carnegie Mellon University
 
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#  http://www.apache.org/licenses/LICENSE-2.0
-#
-# THIS CODE IS PROVIDED *AS IS* BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-# KIND, EITHER EXPRESS OR IMPLIED, INCLUDING WITHOUT LIMITATION ANY IMPLIED
-# WARRANTIES OR CONDITIONS OF TITLE, FITNESS FOR A PARTICULAR PURPOSE,
-# MERCHANTABLITY OR NON-INFRINGEMENT.
-# See the Apache 2 License for the specific language governing permissions and
-# limitations under the License.
 
 import numpy as np
 import os
diff --git a/example/speech_recognition/flac_to_wav.sh b/example/speech_recognition/flac_to_wav.sh
old mode 100644
new mode 100755
diff --git a/example/ssd/tools/prepare_coco.sh b/example/ssd/tools/prepare_coco.sh
old mode 100644
new mode 100755
diff --git a/example/ssd/tools/prepare_pascal.sh b/example/ssd/tools/prepare_pascal.sh
old mode 100644
new mode 100755
diff --git a/include/mxnet/base.h b/include/mxnet/base.h
index 7c136a6470..84b2fea712 100644
--- a/include/mxnet/base.h
+++ b/include/mxnet/base.h
@@ -109,11 +109,11 @@
 #endif
 
 /*! \brief major version */
-#define MXNET_MAJOR 0
+#define MXNET_MAJOR 1
 /*! \brief minor version */
-#define MXNET_MINOR 12
+#define MXNET_MINOR 0
 /*! \brief patch version */
-#define MXNET_PATCH 1
+#define MXNET_PATCH 0
 /*! \brief mxnet version */
 #define MXNET_VERSION (MXNET_MAJOR*10000 + MXNET_MINOR*100 + MXNET_PATCH)
 /*! \brief helper for making version number */
diff --git a/matlab/+mxnet/model.m b/matlab/+mxnet/model.m
index af61091e9f..401029146f 100644
--- a/matlab/+mxnet/model.m
+++ b/matlab/+mxnet/model.m
@@ -1,3 +1,21 @@
+% Licensed to the Apache Software Foundation (ASF) under one
+% or more contributor license agreements.  See the NOTICE file
+% distributed with this work for additional information
+% regarding copyright ownership.  The ASF licenses this file
+% to you under the Apache License, Version 2.0 (the
+% "License"); you may not use this file except in compliance
+% with the License.  You may obtain a copy of the License at
+%
+%   http://www.apache.org/licenses/LICENSE-2.0
+%
+% Unless required by applicable law or agreed to in writing,
+% software distributed under the License is distributed on an
+% "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+% KIND, either express or implied.  See the License for the
+% specific language governing permissions and limitations
+% under the License.
+%
+
 classdef model < handle
 %MODEL MXNet model, supports load and forward
 
diff --git a/matlab/+mxnet/private/callmxnet.m b/matlab/+mxnet/private/callmxnet.m
index 51f3f6f0c9..4cf8c1726e 100644
--- a/matlab/+mxnet/private/callmxnet.m
+++ b/matlab/+mxnet/private/callmxnet.m
@@ -1,3 +1,21 @@
+% Licensed to the Apache Software Foundation (ASF) under one
+% or more contributor license agreements.  See the NOTICE file
+% distributed with this work for additional information
+% regarding copyright ownership.  The ASF licenses this file
+% to you under the Apache License, Version 2.0 (the
+% "License"); you may not use this file except in compliance
+% with the License.  You may obtain a copy of the License at
+%
+%   http://www.apache.org/licenses/LICENSE-2.0
+%
+% Unless required by applicable law or agreed to in writing,
+% software distributed under the License is distributed on an
+% "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+% KIND, either express or implied.  See the License for the
+% specific language governing permissions and limitations
+% under the License.
+%
+
 function callmxnet(func, varargin)
 %CALLMXNET call mxnet functions
 
@@ -7,7 +25,7 @@ function callmxnet(func, varargin)
   cd(mxnet_root);
   mxnet_root = pwd;
   cd(cur_pwd);
-  
+
   assert(exist([mxnet_root, '/lib/libmxnet.so'   ], 'file') == 2 || ...
          exist([mxnet_root, '/lib/libmxnet.dylib'], 'file') == 2 || ...
          exist([mxnet_root, '/lib/libmxnet.dll'  ], 'file') == 2, ...
diff --git a/matlab/+mxnet/private/parse_json.m b/matlab/+mxnet/private/parse_json.m
index 6aa0b4e5a0..fbbb616c7f 100644
--- a/matlab/+mxnet/private/parse_json.m
+++ b/matlab/+mxnet/private/parse_json.m
@@ -1,4 +1,22 @@
-function data = parse_json(fname,varargin)
+% Licensed to the Apache Software Foundation (ASF) under one
+% or more contributor license agreements.  See the NOTICE file
+% distributed with this work for additional information
+% regarding copyright ownership.  The ASF licenses this file
+% to you under the Apache License, Version 2.0 (the
+% "License"); you may not use this file except in compliance
+% with the License.  You may obtain a copy of the License at
+%
+%   http://www.apache.org/licenses/LICENSE-2.0
+%
+% Unless required by applicable law or agreed to in writing,
+% software distributed under the License is distributed on an
+% "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+% KIND, either express or implied.  See the License for the
+% specific language governing permissions and limitations
+% under the License.
+%
+
+    function data = parse_json(fname,varargin)
 %PARSE_JSON parse a JSON (JavaScript Object Notation) file or string
 %
 % Based on jsonlab (https://github.com/fangq/jsonlab) created by Qianqian Fang. Jsonlab is lisonced under BSD or GPL v3.
diff --git a/matlab/demo.m b/matlab/demo.m
index a914175ef0..659b687e0b 100644
--- a/matlab/demo.m
+++ b/matlab/demo.m
@@ -1,3 +1,21 @@
+% Licensed to the Apache Software Foundation (ASF) under one
+% or more contributor license agreements.  See the NOTICE file
+% distributed with this work for additional information
+% regarding copyright ownership.  The ASF licenses this file
+% to you under the Apache License, Version 2.0 (the
+% "License"); you may not use this file except in compliance
+% with the License.  You may obtain a copy of the License at
+%
+%   http://www.apache.org/licenses/LICENSE-2.0
+%
+% Unless required by applicable law or agreed to in writing,
+% software distributed under the License is distributed on an
+% "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+% KIND, either express or implied.  See the License for the
+% specific language governing permissions and limitations
+% under the License.
+%
+
 %% Assumes model symbol and parameters already downloaded using .sh script
 
 %% Load the model
diff --git a/matlab/tests/prepare_data.m b/matlab/tests/prepare_data.m
index 6d450cdd36..429cbc5c64 100644
--- a/matlab/tests/prepare_data.m
+++ b/matlab/tests/prepare_data.m
@@ -1,3 +1,21 @@
+% Licensed to the Apache Software Foundation (ASF) under one
+% or more contributor license agreements.  See the NOTICE file
+% distributed with this work for additional information
+% regarding copyright ownership.  The ASF licenses this file
+% to you under the Apache License, Version 2.0 (the
+% "License"); you may not use this file except in compliance
+% with the License.  You may obtain a copy of the License at
+%
+%   http://www.apache.org/licenses/LICENSE-2.0
+%
+% Unless required by applicable law or agreed to in writing,
+% software distributed under the License is distributed on an
+% "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+% KIND, either express or implied.  See the License for the
+% specific language governing permissions and limitations
+% under the License.
+%
+
 %% download cifar10 dataset
 system('wget https://www.cs.toronto.edu/~kriz/cifar-10-matlab.tar.gz')
 system('tar -xzvf cifar-10-matlab.tar.gz')
diff --git a/matlab/tests/test_prediction.m b/matlab/tests/test_prediction.m
index fe7d7a68ec..ee73c2d21b 100644
--- a/matlab/tests/test_prediction.m
+++ b/matlab/tests/test_prediction.m
@@ -1,3 +1,21 @@
+% Licensed to the Apache Software Foundation (ASF) under one
+% or more contributor license agreements.  See the NOTICE file
+% distributed with this work for additional information
+% regarding copyright ownership.  The ASF licenses this file
+% to you under the Apache License, Version 2.0 (the
+% "License"); you may not use this file except in compliance
+% with the License.  You may obtain a copy of the License at
+%
+%   http://www.apache.org/licenses/LICENSE-2.0
+%
+% Unless required by applicable law or agreed to in writing,
+% software distributed under the License is distributed on an
+% "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+% KIND, either express or implied.  See the License for the
+% specific language governing permissions and limitations
+% under the License.
+%
+
 %% prepare
 
 addpath('..')
diff --git a/nnvm b/nnvm
index 8d79cfd0b4..e4a138ab94 160000
--- a/nnvm
+++ b/nnvm
@@ -1 +1 @@
-Subproject commit 8d79cfd0b42fbe9f6ad75886d495065d5500b9dd
+Subproject commit e4a138ab947d682c83625840bbcd66f70feb4b14
diff --git a/perl-package/AI-MXNet/examples/get_ptb_data.sh b/perl-package/AI-MXNet/examples/get_ptb_data.sh
index d2641cb32b..0a0c7051b0 100755
--- a/perl-package/AI-MXNet/examples/get_ptb_data.sh
+++ b/perl-package/AI-MXNet/examples/get_ptb_data.sh
@@ -17,6 +17,11 @@
 # specific language governing permissions and limitations
 # under the License.
 
+echo ""
+echo "NOTE: Please review the licensing of the datasets in this script before proceeding"
+echo "See https://catalog.ldc.upenn.edu/ldc99t42 for the licensing"
+echo "Once that is done, please uncomment the wget commands in this script"
+echo ""
 
 RNN_DIR=$(cd `dirname $0`; pwd)
 DATA_DIR="${RNN_DIR}/data/"
@@ -26,7 +31,7 @@ if [[ ! -d "${DATA_DIR}" ]]; then
   mkdir -p ${DATA_DIR}
 fi
 
-wget -P ${DATA_DIR} https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/ptb/ptb.train.txt;
-wget -P ${DATA_DIR} https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/ptb/ptb.valid.txt;
-wget -P ${DATA_DIR} https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/ptb/ptb.test.txt;
-wget -P ${DATA_DIR} https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/tinyshakespeare/input.txt;
+#wget -P ${DATA_DIR} https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/ptb/ptb.train.txt;
+#wget -P ${DATA_DIR} https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/ptb/ptb.valid.txt;
+#wget -P ${DATA_DIR} https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/ptb/ptb.test.txt;
+#wget -P ${DATA_DIR} https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/tinyshakespeare/input.txt;
diff --git a/ps-lite b/ps-lite
index bdd4c67e9e..2ce8b9a256 160000
--- a/ps-lite
+++ b/ps-lite
@@ -1 +1 @@
-Subproject commit bdd4c67e9e34dc0b8350ce306b0caa737eb31c83
+Subproject commit 2ce8b9a256207947acfa2cb9b09ab74b8de74547
diff --git a/python/mxnet/kvstore.py b/python/mxnet/kvstore.py
index d068d06579..23eb454b5b 100644
--- a/python/mxnet/kvstore.py
+++ b/python/mxnet/kvstore.py
@@ -408,10 +408,13 @@ def set_gradient_compression(self, compression_params):
             Other keys in this dictionary are optional and specific to the type
             of gradient compression.
         """
-        ckeys, cvals = _ctype_dict(compression_params)
-        check_call(_LIB.MXKVStoreSetGradientCompression(self.handle,
-                                                        mx_uint(len(compression_params)),
-                                                        ckeys, cvals))
+        if ('device' in self.type) or ('dist' in self.type):
+            ckeys, cvals = _ctype_dict(compression_params)
+            check_call(_LIB.MXKVStoreSetGradientCompression(self.handle,
+                                                            mx_uint(len(compression_params)),
+                                                            ckeys, cvals))
+        else:
+            raise Exception('Gradient compression is not supported for this type of kvstore')
 
     def set_optimizer(self, optimizer):
         """ Registers an optimizer with the kvstore.
diff --git a/python/mxnet/libinfo.py b/python/mxnet/libinfo.py
index d4d100e12d..ce60606236 100644
--- a/python/mxnet/libinfo.py
+++ b/python/mxnet/libinfo.py
@@ -61,4 +61,4 @@ def find_lib_path():
 
 
 # current version
-__version__ = "0.12.1"
+__version__ = "1.0.0"
diff --git a/python/mxnet/ndarray/ndarray.py b/python/mxnet/ndarray/ndarray.py
index 91d0e03e3d..a45a6a8247 100644
--- a/python/mxnet/ndarray/ndarray.py
+++ b/python/mxnet/ndarray/ndarray.py
@@ -691,7 +691,7 @@ def _set_nd_basic_indexing(self, key, value):
                         value.copyto(self)
                 elif isinstance(value, numeric_types):
                     _internal._full(shape=shape, ctx=self.context,
-                                    dtype=self.dtype, value=value, out=self)
+                                    dtype=self.dtype, value=float(value), out=self)
                 elif isinstance(value, (np.ndarray, np.generic)):
                     if isinstance(value, np.generic) or value.shape != shape:
                         value = np.broadcast_to(value, shape)
diff --git a/python/mxnet/optimizer.py b/python/mxnet/optimizer.py
index 5eb4f05d6d..013455614f 100644
--- a/python/mxnet/optimizer.py
+++ b/python/mxnet/optimizer.py
@@ -793,9 +793,9 @@ def update(self, index, weight, grad, state):
             srt = op.sqrt(adjusted_add)
             div = _internal._scatter_elemwise_div(grad, srt)
             retained_weight = sparse.retain(weight, grad.indices)
-            to_add = sparse.elemwise_add(div, _internal._mul_scalar(retained_weight, wd))
+            to_add = sparse.elemwise_add(div, _internal._mul_scalar(retained_weight, float(wd)))
             assert len(to_add.indices) == grad_indices_count
-            weight[:] = sparse.elemwise_add(weight, _internal._mul_scalar(to_add, -lr))
+            weight[:] = sparse.elemwise_add(weight, _internal._mul_scalar(to_add, float(-lr)))
             state[:] = history
             assert state.stype == save_history_stype
             assert len(history_indices) == grad_indices_count
diff --git a/python/mxnet/symbol/symbol.py b/python/mxnet/symbol/symbol.py
index e2cf0ecb68..ce7776d948 100644
--- a/python/mxnet/symbol/symbol.py
+++ b/python/mxnet/symbol/symbol.py
@@ -2759,7 +2759,7 @@ def full(shape, val, dtype=None, **kwargs):
     """
     if dtype is None:
         dtype = _numpy.float32
-    return _internal._full(shape=shape, dtype=dtype, value=val, **kwargs)
+    return _internal._full(shape=shape, dtype=dtype, value=float(val), **kwargs)
 
 # pylint: disable=redefined-outer-name
 def arange(start, stop=None, step=1.0, repeat=1, name=None, dtype=None):
diff --git a/scala-package/assembly/linux-x86_64-cpu/pom.xml b/scala-package/assembly/linux-x86_64-cpu/pom.xml
index f15a7e315d..5f1b4a58e8 100644
--- a/scala-package/assembly/linux-x86_64-cpu/pom.xml
+++ b/scala-package/assembly/linux-x86_64-cpu/pom.xml
@@ -1,4 +1,24 @@
 <?xml version="1.0" encoding="UTF-8"?>
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
 <project xmlns="http://maven.apache.org/POM/4.0.0"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
@@ -6,7 +26,7 @@
   <parent>
     <groupId>ml.dmlc.mxnet</groupId>
     <artifactId>mxnet-full-parent_2.11</artifactId>
-    <version>0.12.1-SNAPSHOT</version>
+    <version>1.0.0-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 
@@ -18,12 +38,12 @@
     <dependency>
       <groupId>ml.dmlc.mxnet</groupId>
       <artifactId>mxnet-core_${scala.binary.version}</artifactId>
-      <version>0.12.1-SNAPSHOT</version>
+      <version>1.0.0-SNAPSHOT</version>
     </dependency>
     <dependency>
       <groupId>ml.dmlc.mxnet</groupId>
       <artifactId>libmxnet-scala-linux-x86_64-cpu</artifactId>
-      <version>0.12.1-SNAPSHOT</version>
+      <version>1.0.0-SNAPSHOT</version>
       <type>so</type>
     </dependency>
   </dependencies>
diff --git a/scala-package/assembly/linux-x86_64-cpu/src/main/assembly/assembly.xml b/scala-package/assembly/linux-x86_64-cpu/src/main/assembly/assembly.xml
index 97e34c819d..f221a67f87 100644
--- a/scala-package/assembly/linux-x86_64-cpu/src/main/assembly/assembly.xml
+++ b/scala-package/assembly/linux-x86_64-cpu/src/main/assembly/assembly.xml
@@ -1,3 +1,23 @@
+<?xml version='1.0'?>
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
 <assembly>
   <id>full</id>
   <formats>
diff --git a/scala-package/assembly/linux-x86_64-gpu/pom.xml b/scala-package/assembly/linux-x86_64-gpu/pom.xml
index 81e4d1ec59..d67a703aa4 100644
--- a/scala-package/assembly/linux-x86_64-gpu/pom.xml
+++ b/scala-package/assembly/linux-x86_64-gpu/pom.xml
@@ -1,4 +1,24 @@
 <?xml version="1.0" encoding="UTF-8"?>
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
 <project xmlns="http://maven.apache.org/POM/4.0.0"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
@@ -6,7 +26,7 @@
   <parent>
     <groupId>ml.dmlc.mxnet</groupId>
     <artifactId>mxnet-full-parent_2.11</artifactId>
-    <version>0.12.1-SNAPSHOT</version>
+    <version>1.0.0-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 
@@ -18,12 +38,12 @@
     <dependency>
       <groupId>ml.dmlc.mxnet</groupId>
       <artifactId>mxnet-core_${scala.binary.version}</artifactId>
-      <version>0.12.1-SNAPSHOT</version>
+      <version>1.0.0-SNAPSHOT</version>
     </dependency>
     <dependency>
       <groupId>ml.dmlc.mxnet</groupId>
       <artifactId>libmxnet-scala-linux-x86_64-gpu</artifactId>
-      <version>0.12.1-SNAPSHOT</version>
+      <version>1.0.0-SNAPSHOT</version>
       <type>so</type>
     </dependency>
   </dependencies>
diff --git a/scala-package/assembly/linux-x86_64-gpu/src/main/assembly/assembly.xml b/scala-package/assembly/linux-x86_64-gpu/src/main/assembly/assembly.xml
index ba5030c918..3ee716729a 100644
--- a/scala-package/assembly/linux-x86_64-gpu/src/main/assembly/assembly.xml
+++ b/scala-package/assembly/linux-x86_64-gpu/src/main/assembly/assembly.xml
@@ -1,3 +1,23 @@
+<?xml version='1.0'?>
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
 <assembly>
   <id>full</id>
   <formats>
diff --git a/scala-package/assembly/osx-x86_64-cpu/main/assembly/assembly.xml b/scala-package/assembly/osx-x86_64-cpu/main/assembly/assembly.xml
index fecafecad3..9f015eebd8 100644
--- a/scala-package/assembly/osx-x86_64-cpu/main/assembly/assembly.xml
+++ b/scala-package/assembly/osx-x86_64-cpu/main/assembly/assembly.xml
@@ -1,3 +1,23 @@
+<?xml version='1.0'?>
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
 <assembly>
   <id>full</id>
   <formats>
diff --git a/scala-package/assembly/osx-x86_64-cpu/pom.xml b/scala-package/assembly/osx-x86_64-cpu/pom.xml
index 5e6cb8c7f6..a0be325a68 100644
--- a/scala-package/assembly/osx-x86_64-cpu/pom.xml
+++ b/scala-package/assembly/osx-x86_64-cpu/pom.xml
@@ -1,4 +1,24 @@
 <?xml version="1.0" encoding="UTF-8"?>
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
 <project xmlns="http://maven.apache.org/POM/4.0.0"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
@@ -6,7 +26,7 @@
   <parent>
     <groupId>ml.dmlc.mxnet</groupId>
     <artifactId>mxnet-full-parent_2.11</artifactId>
-    <version>0.12.1-SNAPSHOT</version>
+    <version>1.0.0-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 
@@ -18,12 +38,12 @@
     <dependency>
       <groupId>ml.dmlc.mxnet</groupId>
       <artifactId>mxnet-core_${scala.binary.version}</artifactId>
-      <version>0.12.1-SNAPSHOT</version>
+      <version>1.0.0-SNAPSHOT</version>
     </dependency>
     <dependency>
       <groupId>ml.dmlc.mxnet</groupId>
       <artifactId>libmxnet-scala-osx-x86_64-cpu</artifactId>
-      <version>0.12.1-SNAPSHOT</version>
+      <version>1.0.0-SNAPSHOT</version>
       <type>jnilib</type>
     </dependency>
   </dependencies>
diff --git a/scala-package/assembly/osx-x86_64-cpu/src/main/assembly/assembly.xml b/scala-package/assembly/osx-x86_64-cpu/src/main/assembly/assembly.xml
index 1abf81dd9c..56ba127b91 100644
--- a/scala-package/assembly/osx-x86_64-cpu/src/main/assembly/assembly.xml
+++ b/scala-package/assembly/osx-x86_64-cpu/src/main/assembly/assembly.xml
@@ -1,3 +1,23 @@
+<?xml version='1.0'?>
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
 <assembly>
   <id>full</id>
   <formats>
diff --git a/scala-package/assembly/pom.xml b/scala-package/assembly/pom.xml
index b27630db8e..1502686bdc 100644
--- a/scala-package/assembly/pom.xml
+++ b/scala-package/assembly/pom.xml
@@ -1,4 +1,24 @@
 <?xml version="1.0" encoding="UTF-8"?>
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
 <project xmlns="http://maven.apache.org/POM/4.0.0"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
@@ -6,7 +26,7 @@
   <parent>
     <groupId>ml.dmlc.mxnet</groupId>
     <artifactId>mxnet-parent_2.11</artifactId>
-    <version>0.12.1-SNAPSHOT</version>
+    <version>1.0.0-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 
diff --git a/scala-package/core/pom.xml b/scala-package/core/pom.xml
index 361833685e..adccfae119 100644
--- a/scala-package/core/pom.xml
+++ b/scala-package/core/pom.xml
@@ -1,4 +1,24 @@
 <?xml version="1.0" encoding="UTF-8"?>
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
 <project xmlns="http://maven.apache.org/POM/4.0.0"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
@@ -6,7 +26,7 @@
   <parent>
     <groupId>ml.dmlc.mxnet</groupId>
     <artifactId>mxnet-parent_2.11</artifactId>
-    <version>0.12.1-SNAPSHOT</version>
+    <version>1.0.0-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 
@@ -71,13 +91,13 @@
     <dependency>
       <groupId>ml.dmlc.mxnet</groupId>
       <artifactId>mxnet-init_${scala.binary.version}</artifactId>
-      <version>0.12.1-SNAPSHOT</version>
+      <version>1.0.0-SNAPSHOT</version>
       <scope>provided</scope>
     </dependency>
     <dependency>
       <groupId>ml.dmlc.mxnet</groupId>
       <artifactId>mxnet-macros_${scala.binary.version}</artifactId>
-      <version>0.12.1-SNAPSHOT</version>
+      <version>1.0.0-SNAPSHOT</version>
       <scope>provided</scope>
     </dependency>
   </dependencies>
diff --git a/scala-package/core/src/test/resources/log4j.properties b/scala-package/core/src/test/resources/log4j.properties
index 7d7ca36b28..d82fd7ea4f 100644
--- a/scala-package/core/src/test/resources/log4j.properties
+++ b/scala-package/core/src/test/resources/log4j.properties
@@ -1,3 +1,20 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+# 
+#   http://www.apache.org/licenses/LICENSE-2.0
+# 
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
 # for development debugging
 log4j.rootLogger = debug, stdout
 
diff --git a/scala-package/examples/pom.xml b/scala-package/examples/pom.xml
index 9ad10c9de7..f65caa16a7 100644
--- a/scala-package/examples/pom.xml
+++ b/scala-package/examples/pom.xml
@@ -1,4 +1,24 @@
 <?xml version="1.0" encoding="UTF-8"?>
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
 <project xmlns="http://maven.apache.org/POM/4.0.0"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
@@ -6,7 +26,7 @@
   <parent>
     <groupId>ml.dmlc.mxnet</groupId>
     <artifactId>mxnet-parent_2.11</artifactId>
-    <version>0.12.1-SNAPSHOT</version>
+    <version>1.0.0-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 
@@ -118,7 +138,7 @@
     <dependency>
       <groupId>ml.dmlc.mxnet</groupId>
       <artifactId>mxnet-core_${scala.binary.version}</artifactId>
-      <version>0.12.1-SNAPSHOT</version>
+      <version>1.0.0-SNAPSHOT</version>
       <scope>provided</scope>
     </dependency>
     <dependency>
diff --git a/scala-package/examples/scripts/customop/run_customop.sh b/scala-package/examples/scripts/customop/run_customop.sh
old mode 100644
new mode 100755
diff --git a/scala-package/examples/scripts/customop/run_customopwithrtc.sh b/scala-package/examples/scripts/customop/run_customopwithrtc.sh
old mode 100644
new mode 100755
diff --git a/scala-package/examples/scripts/module/run_sequential_module.sh b/scala-package/examples/scripts/module/run_sequential_module.sh
old mode 100644
new mode 100755
diff --git a/scala-package/examples/scripts/neuralstyle_end2end/run_test_end2end.sh b/scala-package/examples/scripts/neuralstyle_end2end/run_test_end2end.sh
old mode 100644
new mode 100755
diff --git a/scala-package/examples/scripts/neuralstyle_end2end/run_train_end2end.sh b/scala-package/examples/scripts/neuralstyle_end2end/run_train_end2end.sh
old mode 100644
new mode 100755
diff --git a/scala-package/examples/scripts/profiler/run_profiler_matmul.sh b/scala-package/examples/scripts/profiler/run_profiler_matmul.sh
old mode 100644
new mode 100755
diff --git a/scala-package/examples/scripts/profiler/run_profiler_ndarray.sh b/scala-package/examples/scripts/profiler/run_profiler_ndarray.sh
old mode 100644
new mode 100755
diff --git a/scala-package/examples/scripts/rnn/run_lstm_bucketing.sh b/scala-package/examples/scripts/rnn/run_lstm_bucketing.sh
old mode 100644
new mode 100755
diff --git a/scala-package/examples/scripts/rnn/run_test_charrnn.sh b/scala-package/examples/scripts/rnn/run_test_charrnn.sh
old mode 100644
new mode 100755
diff --git a/scala-package/examples/scripts/run_cnntextclassification.sh b/scala-package/examples/scripts/run_cnntextclassification.sh
old mode 100644
new mode 100755
diff --git a/scala-package/examples/scripts/run_gan_mnist.sh b/scala-package/examples/scripts/run_gan_mnist.sh
old mode 100644
new mode 100755
diff --git a/scala-package/examples/scripts/run_multitask.sh b/scala-package/examples/scripts/run_multitask.sh
old mode 100644
new mode 100755
diff --git a/scala-package/examples/scripts/run_neuralstyle.sh b/scala-package/examples/scripts/run_neuralstyle.sh
old mode 100644
new mode 100755
diff --git a/scala-package/examples/scripts/run_visualization.sh b/scala-package/examples/scripts/run_visualization.sh
old mode 100644
new mode 100755
diff --git a/scala-package/examples/src/main/resources/log4j.properties b/scala-package/examples/src/main/resources/log4j.properties
index cb92f4c525..ef523cb7bc 100644
--- a/scala-package/examples/src/main/resources/log4j.properties
+++ b/scala-package/examples/src/main/resources/log4j.properties
@@ -1,3 +1,20 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+# 
+#   http://www.apache.org/licenses/LICENSE-2.0
+# 
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
 # for development debugging
 log4j.rootLogger = info, stdout
 
diff --git a/scala-package/init-native/linux-x86_64/pom.xml b/scala-package/init-native/linux-x86_64/pom.xml
index 983135d911..6a88789742 100644
--- a/scala-package/init-native/linux-x86_64/pom.xml
+++ b/scala-package/init-native/linux-x86_64/pom.xml
@@ -1,4 +1,24 @@
 <?xml version="1.0" encoding="UTF-8"?>
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
 <project xmlns="http://maven.apache.org/POM/4.0.0"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
@@ -6,7 +26,7 @@
   <parent>
     <groupId>ml.dmlc.mxnet</groupId>
     <artifactId>mxnet-scala-init-native-parent</artifactId>
-    <version>0.12.1-SNAPSHOT</version>
+    <version>1.0.0-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 
@@ -20,7 +40,7 @@
     <dependency>
       <groupId>ml.dmlc.mxnet</groupId>
       <artifactId>mxnet-init_${scala.binary.version}</artifactId>
-      <version>0.12.1-SNAPSHOT</version>
+      <version>1.0.0-SNAPSHOT</version>
       <type>jar</type>
       <scope>compile</scope>
     </dependency>
diff --git a/scala-package/init-native/osx-x86_64/pom.xml b/scala-package/init-native/osx-x86_64/pom.xml
index 2ca851baa4..7c0005224d 100644
--- a/scala-package/init-native/osx-x86_64/pom.xml
+++ b/scala-package/init-native/osx-x86_64/pom.xml
@@ -1,4 +1,24 @@
 <?xml version="1.0" encoding="UTF-8"?>
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
 <project xmlns="http://maven.apache.org/POM/4.0.0"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
@@ -6,7 +26,7 @@
   <parent>
     <groupId>ml.dmlc.mxnet</groupId>
     <artifactId>mxnet-scala-init-native-parent</artifactId>
-    <version>0.12.1-SNAPSHOT</version>
+    <version>1.0.0-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 
@@ -20,7 +40,7 @@
     <dependency>
       <groupId>ml.dmlc.mxnet</groupId>
       <artifactId>mxnet-init_${scala.binary.version}</artifactId>
-      <version>0.12.1-SNAPSHOT</version>
+      <version>1.0.0-SNAPSHOT</version>
       <type>jar</type>
       <scope>compile</scope>
     </dependency>
diff --git a/scala-package/init-native/pom.xml b/scala-package/init-native/pom.xml
index 4ae2426bbf..cb04a38888 100644
--- a/scala-package/init-native/pom.xml
+++ b/scala-package/init-native/pom.xml
@@ -1,4 +1,24 @@
 <?xml version="1.0" encoding="UTF-8"?>
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
 <project xmlns="http://maven.apache.org/POM/4.0.0"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
@@ -6,7 +26,7 @@
   <parent>
     <groupId>ml.dmlc.mxnet</groupId>
     <artifactId>mxnet-parent_2.11</artifactId>
-    <version>0.12.1-SNAPSHOT</version>
+    <version>1.0.0-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 
diff --git a/scala-package/init/pom.xml b/scala-package/init/pom.xml
index eed3aee82a..31342335b1 100644
--- a/scala-package/init/pom.xml
+++ b/scala-package/init/pom.xml
@@ -1,4 +1,24 @@
 <?xml version="1.0" encoding="UTF-8"?>
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
 <project xmlns="http://maven.apache.org/POM/4.0.0"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
@@ -6,7 +26,7 @@
   <parent>
     <groupId>ml.dmlc.mxnet</groupId>
     <artifactId>mxnet-parent_2.11</artifactId>
-    <version>0.12.1-SNAPSHOT</version>
+    <version>1.0.0-SNAPSHOT</version>
 <!--  <relativePath>../pom.xml</relativePath>-->
   </parent>
 
diff --git a/scala-package/macros/pom.xml b/scala-package/macros/pom.xml
index 76f2438d38..e2ee632108 100644
--- a/scala-package/macros/pom.xml
+++ b/scala-package/macros/pom.xml
@@ -1,4 +1,24 @@
 <?xml version="1.0" encoding="UTF-8"?>
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
 <project xmlns="http://maven.apache.org/POM/4.0.0"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
@@ -6,7 +26,7 @@
   <parent>
     <groupId>ml.dmlc.mxnet</groupId>
     <artifactId>mxnet-parent_2.11</artifactId>
-    <version>0.12.1-SNAPSHOT</version>
+    <version>1.0.0-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 
@@ -41,13 +61,13 @@
     <dependency>
       <groupId>ml.dmlc.mxnet</groupId>
       <artifactId>mxnet-init_${scala.binary.version}</artifactId>
-      <version>0.12.1-SNAPSHOT</version>
+      <version>1.0.0-SNAPSHOT</version>
       <scope>provided</scope>
     </dependency>
     <dependency>
       <groupId>ml.dmlc.mxnet</groupId>
       <artifactId>libmxnet-init-scala-${platform}</artifactId>
-      <version>0.12.1-SNAPSHOT</version>
+      <version>1.0.0-SNAPSHOT</version>
       <scope>provided</scope>
       <type>${libtype}</type>
     </dependency>
diff --git a/scala-package/native/linux-x86_64-cpu/pom.xml b/scala-package/native/linux-x86_64-cpu/pom.xml
index 47194069e4..cbe124ad36 100644
--- a/scala-package/native/linux-x86_64-cpu/pom.xml
+++ b/scala-package/native/linux-x86_64-cpu/pom.xml
@@ -1,4 +1,24 @@
 <?xml version="1.0" encoding="UTF-8"?>
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
 <project xmlns="http://maven.apache.org/POM/4.0.0"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
@@ -6,7 +26,7 @@
   <parent>
     <groupId>ml.dmlc.mxnet</groupId>
     <artifactId>mxnet-scala-native-parent</artifactId>
-    <version>0.12.1-SNAPSHOT</version>
+    <version>1.0.0-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 
@@ -20,7 +40,7 @@
     <dependency>
       <groupId>ml.dmlc.mxnet</groupId>
       <artifactId>mxnet-core_${scala.binary.version}</artifactId>
-      <version>0.12.1-SNAPSHOT</version>
+      <version>1.0.0-SNAPSHOT</version>
       <type>jar</type>
       <scope>compile</scope>
     </dependency>
diff --git a/scala-package/native/linux-x86_64-gpu/pom.xml b/scala-package/native/linux-x86_64-gpu/pom.xml
index 5e038b2d5a..705d84e978 100644
--- a/scala-package/native/linux-x86_64-gpu/pom.xml
+++ b/scala-package/native/linux-x86_64-gpu/pom.xml
@@ -1,4 +1,24 @@
 <?xml version="1.0" encoding="UTF-8"?>
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
 <project xmlns="http://maven.apache.org/POM/4.0.0"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
@@ -6,7 +26,7 @@
   <parent>
     <groupId>ml.dmlc.mxnet</groupId>
     <artifactId>mxnet-scala-native-parent</artifactId>
-    <version>0.12.1-SNAPSHOT</version>
+    <version>1.0.0-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 
@@ -20,7 +40,7 @@
     <dependency>
       <groupId>ml.dmlc.mxnet</groupId>
       <artifactId>mxnet-core_${scala.binary.version}</artifactId>
-      <version>0.12.1-SNAPSHOT</version>
+      <version>1.0.0-SNAPSHOT</version>
       <type>jar</type>
       <scope>compile</scope>
     </dependency>
diff --git a/scala-package/native/osx-x86_64-cpu/pom.xml b/scala-package/native/osx-x86_64-cpu/pom.xml
index 227ad24f8f..b8e741abf8 100644
--- a/scala-package/native/osx-x86_64-cpu/pom.xml
+++ b/scala-package/native/osx-x86_64-cpu/pom.xml
@@ -1,4 +1,24 @@
 <?xml version="1.0" encoding="UTF-8"?>
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
 <project xmlns="http://maven.apache.org/POM/4.0.0"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
@@ -6,7 +26,7 @@
   <parent>
     <groupId>ml.dmlc.mxnet</groupId>
     <artifactId>mxnet-scala-native-parent</artifactId>
-    <version>0.12.1-SNAPSHOT</version>
+    <version>1.0.0-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 
@@ -20,7 +40,7 @@
     <dependency>
       <groupId>ml.dmlc.mxnet</groupId>
       <artifactId>mxnet-core_${scala.binary.version}</artifactId>
-      <version>0.12.1-SNAPSHOT</version>
+      <version>1.0.0-SNAPSHOT</version>
       <type>jar</type>
       <scope>compile</scope>
     </dependency>
diff --git a/scala-package/native/pom.xml b/scala-package/native/pom.xml
index ffb8740239..dfa23866f5 100644
--- a/scala-package/native/pom.xml
+++ b/scala-package/native/pom.xml
@@ -1,4 +1,24 @@
 <?xml version="1.0" encoding="UTF-8"?>
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
 <project xmlns="http://maven.apache.org/POM/4.0.0"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
@@ -6,7 +26,7 @@
   <parent>
     <groupId>ml.dmlc.mxnet</groupId>
     <artifactId>mxnet-parent_2.11</artifactId>
-    <version>0.12.1-SNAPSHOT</version>
+    <version>1.0.0-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 
diff --git a/scala-package/pom.xml b/scala-package/pom.xml
index a91fbe4420..68eb598012 100644
--- a/scala-package/pom.xml
+++ b/scala-package/pom.xml
@@ -5,7 +5,7 @@
   <modelVersion>4.0.0</modelVersion>
   <groupId>ml.dmlc.mxnet</groupId>
   <artifactId>mxnet-parent_2.11</artifactId>
-  <version>0.12.1-SNAPSHOT</version>
+  <version>1.0.0-SNAPSHOT</version>
   <name>MXNet Scala Package - Parent</name>
   <url>https://github.com/dmlc/mxnet/tree/master/scala-package</url>
   <description>MXNet Scala Package</description>
diff --git a/scala-package/spark/pom.xml b/scala-package/spark/pom.xml
index 22114fb0a4..68e6f4e87b 100644
--- a/scala-package/spark/pom.xml
+++ b/scala-package/spark/pom.xml
@@ -1,4 +1,24 @@
 <?xml version="1.0" encoding="UTF-8"?>
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
 <project xmlns="http://maven.apache.org/POM/4.0.0"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
@@ -6,7 +26,7 @@
   <parent>
     <groupId>ml.dmlc.mxnet</groupId>
     <artifactId>mxnet-parent_2.11</artifactId>
-    <version>0.12.1-SNAPSHOT</version>
+    <version>1.0.0-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 
@@ -21,7 +41,7 @@
     <dependency>
       <groupId>ml.dmlc.mxnet</groupId>
       <artifactId>mxnet-core_${scala.binary.version}</artifactId>
-      <version>0.12.1-SNAPSHOT</version>
+      <version>1.0.0-SNAPSHOT</version>
       <scope>provided</scope>
     </dependency>
     <dependency>
diff --git a/setup-utils/install-mxnet-amz-linux.sh b/setup-utils/install-mxnet-amz-linux.sh
old mode 100644
new mode 100755
diff --git a/setup-utils/install-mxnet-fedora-python.sh b/setup-utils/install-mxnet-fedora-python.sh
old mode 100644
new mode 100755
diff --git a/setup-utils/install-mxnet-osx-python.sh b/setup-utils/install-mxnet-osx-python.sh
index 25a44796cb..d6efa4083a 100755
--- a/setup-utils/install-mxnet-osx-python.sh
+++ b/setup-utils/install-mxnet-osx-python.sh
@@ -33,7 +33,7 @@ then
 	# TODO: Change this to latest tag
 	#       to avoid updating this value for every release
 	#
-	export MXNET_TAG="0.12.0"
+	export MXNET_TAG="1.0.0"
 fi
 
 export TARIKH=`/bin/date +%Y-%m-%d-%H:%M:%S`
diff --git a/setup-utils/install-mxnet-ubuntu-python.sh b/setup-utils/install-mxnet-ubuntu-python.sh
old mode 100644
new mode 100755
diff --git a/setup-utils/install-mxnet-ubuntu-r.sh b/setup-utils/install-mxnet-ubuntu-r.sh
old mode 100644
new mode 100755
diff --git a/snapcraft.yaml b/snapcraft.yaml
index de68a8077f..bbc8087a74 100644
--- a/snapcraft.yaml
+++ b/snapcraft.yaml
@@ -1,5 +1,5 @@
 name: mxnet
-version: '0.12.1'
+version: '1.0.0'
 summary: MXNet is a deep learning framework designed for efficiency and flexibility.
 description: |
   MXNet is a deep learning framework designed for both efficiency and 
diff --git a/src/common/rtc.cc b/src/common/rtc.cc
index cc51aaa108..c48afc6895 100644
--- a/src/common/rtc.cc
+++ b/src/common/rtc.cc
@@ -74,7 +74,7 @@ CudaModule::Chunk::~Chunk() {
 CUfunction CudaModule::Chunk::GetFunction(
     const std::string& mangled_name,
     const Context& ctx) {
-  CHECK_EQ(ctx.dev_mask(), gpu::kDevMask)
+  CHECK_EQ(ctx.dev_mask(), Context::kGPU)
       << "CUDA Runtime compilation only supports Nvidia GPU.";
   auto iter = mod_.find(ctx.dev_id);
   CUmodule module;
diff --git a/src/engine/threaded_engine_perdevice.cc b/src/engine/threaded_engine_perdevice.cc
index c01de75384..e7e222f6cb 100644
--- a/src/engine/threaded_engine_perdevice.cc
+++ b/src/engine/threaded_engine_perdevice.cc
@@ -55,7 +55,6 @@ class ThreadedEnginePerDevice : public ThreadedEngine {
 #ifndef _WIN32
     pthread_atfork(
       []() {
-        Engine::Get()->WaitForAll();
         Engine::Get()->Stop();
       },
       []() {
@@ -71,10 +70,10 @@ class ThreadedEnginePerDevice : public ThreadedEngine {
 #endif
   }
   ~ThreadedEnginePerDevice() noexcept(false) {
-    this->Stop();
+    this->StopNoWait();
   }
 
-  void Stop() override {
+  void StopNoWait() {
     SignalQueuesForKill();
     gpu_normal_workers_.Clear();
     gpu_copy_workers_.Clear();
@@ -82,16 +81,24 @@ class ThreadedEnginePerDevice : public ThreadedEngine {
     cpu_priority_worker_.reset(nullptr);
   }
 
+  void Stop() override {
+    if (is_worker_) return;
+    WaitForAll();
+    StopNoWait();
+  }
+
   void Start() override {
+    if (is_worker_) return;
     gpu_worker_nthreads_ = common::GetNumThreadPerGPU();
     cpu_worker_nthreads_ = dmlc::GetEnv("MXNET_CPU_WORKER_NTHREADS", 1);
     // create CPU task
     int cpu_priority_nthreads = dmlc::GetEnv("MXNET_CPU_PRIORITY_NTHREADS", 4);
     cpu_priority_worker_.reset(new ThreadWorkerBlock<kPriorityQueue>());
     cpu_priority_worker_->pool.reset(new ThreadPool(
-        cpu_priority_nthreads, [this]() {
-          this->CPUWorker(Context(), cpu_priority_worker_.get());
-        }));
+        cpu_priority_nthreads,
+        [this](std::shared_ptr<ThreadPool::SimpleEvent> ready_event) {
+          this->CPUWorker(Context(), cpu_priority_worker_.get(), ready_event);
+        }, true));
     // GPU tasks will be created lazily
   }
 
@@ -116,9 +123,10 @@ class ThreadedEnginePerDevice : public ThreadedEngine {
           auto ptr =
           cpu_normal_workers_.Get(dev_id, [this, ctx, nthread]() {
               auto blk = new ThreadWorkerBlock<kWorkerQueue>();
-              blk->pool.reset(new ThreadPool(nthread, [this, ctx, blk] () {
-                    this->CPUWorker(ctx, blk);
-                  }));
+              blk->pool.reset(new ThreadPool(nthread,
+                  [this, ctx, blk](std::shared_ptr<ThreadPool::SimpleEvent> ready_event) {
+                    this->CPUWorker(ctx, blk, ready_event);
+                  }, true));
               return blk;
             });
           if (ptr) {
@@ -196,6 +204,8 @@ class ThreadedEnginePerDevice : public ThreadedEngine {
     ~ThreadWorkerBlock() noexcept(false) {}
   };
 
+  /*! \brief whether this is a worker thread. */
+  static MX_THREAD_LOCAL bool is_worker_;
   /*! \brief number of concurrent thread cpu worker uses */
   int cpu_worker_nthreads_;
   /*! \brief number of concurrent thread each gpu worker uses */
@@ -219,6 +229,7 @@ class ThreadedEnginePerDevice : public ThreadedEngine {
                         bool is_copy_worker,
                         ThreadWorkerBlock<type> *block,
                         std::shared_ptr<ThreadPool::SimpleEvent> ready_event) {
+    this->is_worker_ = true;
 #if MXNET_USE_CUDA
     mshadow::Stream<gpu> *stream;
     do {
@@ -250,11 +261,14 @@ class ThreadedEnginePerDevice : public ThreadedEngine {
    */
   template<dmlc::ConcurrentQueueType type>
   inline void CPUWorker(Context ctx,
-                        ThreadWorkerBlock<type> *block) {
+                        ThreadWorkerBlock<type> *block,
+                        std::shared_ptr<ThreadPool::SimpleEvent> ready_event) {
+    this->is_worker_ = true;
     auto* task_queue = &(block->task_queue);
     RunContext run_ctx{ctx, nullptr};
     // execute task
     OprBlock* opr_block;
+    ready_event->signal();
     while (task_queue->Pop(&opr_block)) {
       this->ExecuteOprBlock(run_ctx, opr_block);
     }
@@ -303,5 +317,8 @@ class ThreadedEnginePerDevice : public ThreadedEngine {
 Engine *CreateThreadedEnginePerDevice() {
   return new ThreadedEnginePerDevice();
 }
+
+MX_THREAD_LOCAL bool ThreadedEnginePerDevice::is_worker_ = false;
+
 }  // namespace engine
 }  // namespace mxnet
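
Note on the src/engine/threaded_engine_perdevice.cc hunks above (this note and the sketch are not part of the patch). The change appears to combine three ideas: a thread-local is_worker_ flag that turns Start() and Stop() into no-ops when a worker thread calls them, a Stop() that drains pending work through WaitForAll() before tearing the pools down while the destructor uses StopNoWait(), and a ready event that CPUWorker signals once it is running. The sketch below illustrates that pattern using only the C++ standard library. MiniEngine and every member in it are hypothetical names invented for this illustration; it is not MXNet's ThreadedEnginePerDevice or ThreadPool, and it assumes one worker thread and one controlling thread.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical stand-in for illustration only; not MXNet code.
class MiniEngine {
 public:
  MiniEngine() { Start(); }
  ~MiniEngine() { StopNoWait(); }

  // Spawn the worker and block until it has signalled that it is running.
  void Start() {
    if (is_worker_ || worker_.joinable()) return;  // no-op on worker threads
    {
      std::lock_guard<std::mutex> lock(mu_);
      kill_ = false;
      ready_ = false;
    }
    worker_ = std::thread([this] { WorkerLoop(); });
    std::unique_lock<std::mutex> lock(mu_);
    cv_.wait(lock, [this] { return ready_; });
  }

  void Push(std::function<void()> task) {
    {
      std::lock_guard<std::mutex> lock(mu_);
      tasks_.push(std::move(task));
      ++pending_;
    }
    cv_.notify_all();
  }

  // Block until every pushed task has finished (requires a running worker).
  void WaitForAll() {
    std::unique_lock<std::mutex> lock(mu_);
    cv_.wait(lock, [this] { return pending_ == 0; });
  }

  // Drain outstanding work, then tear the worker down.
  void Stop() {
    if (is_worker_) return;  // a worker must never wait on or join itself
    WaitForAll();
    StopNoWait();
  }

 private:
  // Tear down without draining; tasks still queued are discarded.
  void StopNoWait() {
    assert(!is_worker_);
    {
      std::lock_guard<std::mutex> lock(mu_);
      kill_ = true;
    }
    cv_.notify_all();
    if (worker_.joinable()) worker_.join();
    std::lock_guard<std::mutex> lock(mu_);
    std::queue<std::function<void()>>().swap(tasks_);
    pending_ = 0;
  }

  void WorkerLoop() {
    is_worker_ = true;  // thread-local: marks only this thread
    {
      std::lock_guard<std::mutex> lock(mu_);
      ready_ = true;  // plays the role of ready_event->signal()
    }
    cv_.notify_all();
    for (;;) {
      std::function<void()> task;
      {
        std::unique_lock<std::mutex> lock(mu_);
        cv_.wait(lock, [this] { return kill_ || !tasks_.empty(); });
        if (kill_) return;
        task = std::move(tasks_.front());
        tasks_.pop();
      }
      task();
      {
        std::lock_guard<std::mutex> lock(mu_);
        --pending_;
      }
      cv_.notify_all();
    }
  }

  static thread_local bool is_worker_;
  std::mutex mu_;
  std::condition_variable cv_;
  std::queue<std::function<void()>> tasks_;
  int pending_ = 0;
  bool kill_ = false;
  bool ready_ = false;
  std::thread worker_;
};

thread_local bool MiniEngine::is_worker_ = false;
```

In this sketch a task that calls Stop() or Start() on its own engine returns immediately instead of deadlocking on WaitForAll() or joining its own thread, which is the situation the is_worker_ guard in the patch seems intended to prevent. Folding WaitForAll() into Stop() also means callers such as the pthread_atfork prepare handler no longer have to remember to drain the queue themselves.
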
diff --git a/src/nnvm/legacy_json_util.cc b/src/nnvm/legacy_json_util.cc
index bdd983cd3a..2ddd4a1989 100644
--- a/src/nnvm/legacy_json_util.cc
+++ b/src/nnvm/legacy_json_util.cc
@@ -1,3 +1,22 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
 /*!
  *  Copyright (c) 2016 by Contributors
  * \file legacy_json_util.cc
diff --git a/src/nnvm/legacy_op_util.cc b/src/nnvm/legacy_op_util.cc
index e5d1d1c8de..6048d15496 100644
--- a/src/nnvm/legacy_op_util.cc
+++ b/src/nnvm/legacy_op_util.cc
@@ -1,3 +1,22 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
 /*!
  *  Copyright (c) 2015 by Contributors
  * \file legacy_op_util.cc
diff --git a/src/operator/contrib/ctc_include/contrib/moderngpu/include/mgpuenums.h b/src/operator/contrib/ctc_include/contrib/moderngpu/include/mgpuenums.h
index be2b8314a8..601614b21a 100644
--- a/src/operator/contrib/ctc_include/contrib/moderngpu/include/mgpuenums.h
+++ b/src/operator/contrib/ctc_include/contrib/moderngpu/include/mgpuenums.h
@@ -1,3 +1,22 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
 /******************************************************************************
  * Copyright (c) 2013, NVIDIA CORPORATION.  All rights reserved.
  * 
diff --git a/src/operator/contrib/ctc_include/contrib/moderngpu/include/util/static.h b/src/operator/contrib/ctc_include/contrib/moderngpu/include/util/static.h
index c720907750..016015503b 100644
--- a/src/operator/contrib/ctc_include/contrib/moderngpu/include/util/static.h
+++ b/src/operator/contrib/ctc_include/contrib/moderngpu/include/util/static.h
@@ -1,3 +1,22 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
 /******************************************************************************
  * Copyright (c) 2013, NVIDIA CORPORATION.  All rights reserved.
  * 
diff --git a/src/operator/contrib/ctc_include/detail/cpu_ctc.h b/src/operator/contrib/ctc_include/detail/cpu_ctc.h
index ba8bbc558f..1509b6dda1 100644
--- a/src/operator/contrib/ctc_include/detail/cpu_ctc.h
+++ b/src/operator/contrib/ctc_include/detail/cpu_ctc.h
@@ -1,3 +1,22 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
 #pragma once
 
 #include <tuple>
diff --git a/src/operator/contrib/ctc_include/detail/ctc_helper.h b/src/operator/contrib/ctc_include/detail/ctc_helper.h
index 35b7a96014..6dae61aa3b 100644
--- a/src/operator/contrib/ctc_include/detail/ctc_helper.h
+++ b/src/operator/contrib/ctc_include/detail/ctc_helper.h
@@ -1,3 +1,22 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
 #pragma once
 
 #include <limits>
diff --git a/src/operator/contrib/ctc_include/detail/gpu_ctc.h b/src/operator/contrib/ctc_include/detail/gpu_ctc.h
index c9cab80f34..b3c5d1f18a 100644
--- a/src/operator/contrib/ctc_include/detail/gpu_ctc.h
+++ b/src/operator/contrib/ctc_include/detail/gpu_ctc.h
@@ -1,3 +1,22 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
 #pragma once
 
 
diff --git a/src/operator/contrib/ctc_include/detail/gpu_ctc_kernels.h b/src/operator/contrib/ctc_include/detail/gpu_ctc_kernels.h
index 7f53232f87..99d8e4dd26 100644
--- a/src/operator/contrib/ctc_include/detail/gpu_ctc_kernels.h
+++ b/src/operator/contrib/ctc_include/detail/gpu_ctc_kernels.h
@@ -1,3 +1,22 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
 #pragma once
 
 #include "../contrib/moderngpu/include/device/ctascan.cuh"
diff --git a/src/operator/contrib/ctc_include/detail/hostdevice.h b/src/operator/contrib/ctc_include/detail/hostdevice.h
index 7bec1e0017..4f68d0381a 100644
--- a/src/operator/contrib/ctc_include/detail/hostdevice.h
+++ b/src/operator/contrib/ctc_include/detail/hostdevice.h
@@ -1,3 +1,22 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
 #pragma once
 
 #ifdef __CUDACC__
diff --git a/src/operator/contrib/nn/deformable_im2col.h b/src/operator/contrib/nn/deformable_im2col.h
index 0d644e2604..1c25982ed3 100644
--- a/src/operator/contrib/nn/deformable_im2col.h
+++ b/src/operator/contrib/nn/deformable_im2col.h
@@ -1,22 +1,3 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- *   http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing,
- * software distributed under the License is distributed on an
- * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
- * KIND, either express or implied.  See the License for the
- * specific language governing permissions and limitations
- * under the License.
- */
-
 /*!
  ******************* BEGIN Caffe Copyright Notice and Disclaimer ****************
  *
diff --git a/src/operator/contrib/psroi_pooling-inl.h b/src/operator/contrib/psroi_pooling-inl.h
index ff05304532..3a3a9c3492 100644
--- a/src/operator/contrib/psroi_pooling-inl.h
+++ b/src/operator/contrib/psroi_pooling-inl.h
@@ -1,22 +1,3 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- *   http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing,
- * software distributed under the License is distributed on an
- * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
- * KIND, either express or implied.  See the License for the
- * specific language governing permissions and limitations
- * under the License.
- */
-
 /*!
  * Copyright (c) 2017 by Contributors
  * Copyright (c) 2017 Microsoft
diff --git a/src/operator/nn/pool.h b/src/operator/nn/pool.h
index 67412586c8..79accb5d52 100644
--- a/src/operator/nn/pool.h
+++ b/src/operator/nn/pool.h
@@ -1,22 +1,3 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- *   http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing,
- * software distributed under the License is distributed on an
- * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
- * KIND, either express or implied.  See the License for the
- * specific language governing permissions and limitations
- * under the License.
- */
-
 /*!
  ******************* BEGIN Caffe Copyright Notice and Disclaimer ****************
  *
diff --git a/src/operator/operator_tune.h b/src/operator/operator_tune.h
index 4f92c9d3cb..b343e83a02 100644
--- a/src/operator/operator_tune.h
+++ b/src/operator/operator_tune.h
@@ -56,7 +56,7 @@ class OperatorTuneBase {
    * \return Tick object representing the current itmestamp
    */
   static MSHADOW_CINLINE Tick Now() {
-    return std::move(std::chrono::high_resolution_clock::now());
+    return std::chrono::high_resolution_clock::now();
   }
 
   /*!
@@ -154,7 +154,7 @@ class OperatorTuneByType : public OperatorTuneBase {
    * \brief Get the current tuning mode
    * \return tune::TuningMode value for the current tuning mode
    */
-  static MSHADOW_CINLINE volatile tune::TuningMode tuning_mode() {
+  static MSHADOW_CINLINE tune::TuningMode tuning_mode() {
     return tuning_mode_;
   }
 
diff --git a/src/operator/special_functions-inl.h b/src/operator/special_functions-inl.h
index f51cfeec9f..743391e0fc 100644
--- a/src/operator/special_functions-inl.h
+++ b/src/operator/special_functions-inl.h
@@ -1,22 +1,3 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- *   http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing,
- * software distributed under the License is distributed on an
- * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
- * KIND, either express or implied.  See the License for the
- * specific language governing permissions and limitations
- * under the License.
- */
-
 /*!
  * Copyright (c) 2015 by Contributors
  * \file special_functions-inl.h
diff --git a/src/storage/cpu_shared_storage_manager.h b/src/storage/cpu_shared_storage_manager.h
index d623cf2c7b..9f0f2a354d 100644
--- a/src/storage/cpu_shared_storage_manager.h
+++ b/src/storage/cpu_shared_storage_manager.h
@@ -139,6 +139,7 @@ void CPUSharedStorageManager::Alloc(Storage::Handle* handle) {
   ptr = mmap(NULL, size, PROT_READ|PROT_WRITE, MAP_SHARED, fid, 0);
   CHECK_NE(ptr, MAP_FAILED)
       << "Failed to map shared memory. mmap failed with error " << strerror(errno);
+  close(fid);
 #endif  // _WIN32
 
   if (is_new) {
diff --git a/tests/ci_build/install/install_julia.sh b/tests/ci_build/install/install_julia.sh
old mode 100644
new mode 100755
diff --git a/tests/ci_build/install/install_library.sh b/tests/ci_build/install/install_library.sh
old mode 100644
new mode 100755
diff --git a/tests/ci_build/install/install_maven.sh b/tests/ci_build/install/install_maven.sh
old mode 100644
new mode 100755
diff --git a/tests/ci_build/install/install_openblas.sh b/tests/ci_build/install/install_openblas.sh
old mode 100644
new mode 100755
diff --git a/tests/ci_build/install/install_opencv.sh b/tests/ci_build/install/install_opencv.sh
old mode 100644
new mode 100755
diff --git a/tests/ci_build/install/install_python2.sh b/tests/ci_build/install/install_python2.sh
old mode 100644
new mode 100755
diff --git a/tests/ci_build/install/install_python3.sh b/tests/ci_build/install/install_python3.sh
old mode 100644
new mode 100755
diff --git a/tests/ci_build/install/install_testdeps.sh b/tests/ci_build/install/install_testdeps.sh
old mode 100644
new mode 100755
diff --git a/tests/jenkins/set_user_permissions.sh b/tests/jenkins/set_user_permissions.sh
old mode 100644
new mode 100755
diff --git a/tests/nightly/compilation_warnings/compilation_warnings.sh b/tests/nightly/compilation_warnings/compilation_warnings.sh
old mode 100644
new mode 100755
diff --git a/tests/nightly/download.sh b/tests/nightly/download.sh
old mode 100644
new mode 100755
diff --git a/tests/nightly/sh2ju.sh b/tests/nightly/sh2ju.sh
old mode 100644
new mode 100755
diff --git a/tests/python/unittest/test_ndarray.py b/tests/python/unittest/test_ndarray.py
index 8e1f68fd62..5512b07c77 100644
--- a/tests/python/unittest/test_ndarray.py
+++ b/tests/python/unittest/test_ndarray.py
@@ -926,6 +926,16 @@ def test_getitem_autograd(np_array, index):
         test_getitem_autograd(np_array, index[0])
 
 
+def test_assign_float_value_to_ndarray():
+    """Test case from https://github.com/apache/incubator-mxnet/issues/8668"""
+    a = np.array([47.844944], dtype=np.float32)
+    b = mx.nd.zeros(1, dtype=np.float32)
+    b[0] = a
+    assert same(a, b.asnumpy())
+    b[0] = a[0]
+    assert same(a, b.asnumpy())
+
+
 if __name__ == '__main__':
     import nose
     nose.runmodule()
diff --git a/tests/travis/r_vignettes.R b/tests/travis/r_vignettes.R
index 1aa9a4f755..1b03b8bba4 100644
--- a/tests/travis/r_vignettes.R
+++ b/tests/travis/r_vignettes.R
@@ -1,3 +1,20 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
 fnames <- list.files("R-package/vignettes/", pattern="*.Rmd")
 sapply(fnames, function(x){
 	knitr::purl(paste0("R-package/vignettes/", x))
diff --git a/tools/license_header.py b/tools/license_header.py
index e26fd2beca..7903279d69 100644
--- a/tools/license_header.py
+++ b/tools/license_header.py
@@ -63,7 +63,16 @@
                'nnvm',
                'ps-lite',
                'src/operator/mkl/',
-               'src/operator/contrib/ctc_include/']
+               'cmake/Modules/FindJeMalloc.cmake',
+               'src/operator/special_functions-inl.h',
+               'src/operator/nn/pool.h',
+               'src/operator/contrib/psroi_pooling-inl.h',
+               'src/operator/contrib/nn/deformable_im2col.h',
+               'example/speech-demo/io_func/convert2kaldi.py',
+               'example/speech-demo/decode_mxnet.sh',
+               'example/image-classification/predict-cpp/image-classification-predict.cc',
+               'src/operator/contrib/ctc_include/',
+               'cmake/Modules/FindJeMalloc.cmake']
 
 # language extensions and the according commment mark
 _LANGS = {'.cc':'*', '.h':'*', '.cu':'*', '.cuh':'*', '.py':'#',
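
A note on the `close(fid)` line added to src/storage/cpu_shared_storage_manager.h above: under POSIX, closing a file descriptor does not unmap a region created from it with mmap, so the descriptor can be released as soon as the mapping succeeds, and leaving it open would leak one descriptor per allocation. The following is a minimal standalone sketch of that behaviour, not MXNet code. The helper name `RoundTripAfterClose` and the temporary-file path are hypothetical, and a temporary file stands in for the shared-memory object used in the patch.

```cpp
#include <sys/mman.h>
#include <unistd.h>

#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical helper (not part of MXNet): maps a temporary file, closes the
// descriptor immediately after mmap succeeds, then writes and reads back
// `payload` through the mapping. Returns an empty string on any failure.
std::string RoundTripAfterClose(const std::string& payload) {
  char path[] = "/tmp/mmap_close_demoXXXXXX";  // assumed writable location
  int fd = mkstemp(path);
  if (fd < 0) return "";
  unlink(path);  // the file stays alive while the descriptor or mapping exists

  const size_t size = payload.size() + 1;  // room for the trailing '\0'
  if (ftruncate(fd, static_cast<off_t>(size)) != 0) {
    close(fd);
    return "";
  }

  void* ptr = mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
  if (ptr == MAP_FAILED) {
    close(fd);
    return "";
  }

  // Same step as the patch: release the descriptor once the mapping exists.
  close(fd);

  // The mapping is still valid and writable after close().
  std::memcpy(ptr, payload.c_str(), size);
  std::string out(static_cast<const char*>(ptr));

  munmap(ptr, size);
  return out;
}
```

Calling `RoundTripAfterClose("hello")` should return the same string, which shows that the mapping stays usable after the descriptor is closed.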

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services