Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2018/08/24 22:30:05 UTC

[GitHub] szha closed pull request #12342: update release news

szha closed pull request #12342: update release news
URL: https://github.com/apache/incubator-mxnet/pull/12342
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:


diff --git a/NEWS.md b/NEWS.md
index c0d54973603..925c63a3c78 100644
--- a/NEWS.md
+++ b/NEWS.md
@@ -1,6 +1,23 @@
 MXNet Change Log
 ================
 ## 1.3.0
+
+### New Features - Gluon RNN layers are now HybridBlocks
+- In this release, Gluon RNN layers such as `gluon.rnn.RNN`, `gluon.rnn.LSTM`, and `gluon.rnn.GRU` become `HybridBlock`s as part of the [gluon.rnn improvements project](https://github.com/apache/incubator-mxnet/projects/11) (#11482).
+- This is the result of newly available fused RNN operators added for CPU: LSTM ([#10104](https://github.com/apache/incubator-mxnet/pull/10104)), vanilla RNN ([#10104](https://github.com/apache/incubator-mxnet/pull/10104)), GRU ([#10311](https://github.com/apache/incubator-mxnet/pull/10311)).
+- Many dynamic networks built on Gluon RNN layers can now be completely hybridized, exported, and used for inference in other language bindings such as R and Scala, as sketched below.
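+
+For illustration, a minimal sketch of hybridizing and exporting one of these layers (layer sizes and input shapes here are only examples):
+
+```python
+import mxnet as mx
+from mxnet import gluon
+
+# gluon.rnn.LSTM is now a HybridBlock, so it can be hybridized ...
+lstm = gluon.rnn.LSTM(hidden_size=100, num_layers=2)
+lstm.initialize()
+lstm.hybridize()
+
+x = mx.nd.random.uniform(shape=(30, 16, 50))  # (sequence, batch, feature) in the default TNC layout
+out = lstm(x)
+
+# ... and exported as a symbol file plus parameters for use from other language bindings
+lstm.export('lstm_model')
+```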
+
+### MKL-DNN improvements
+- Introduced broader functionality support for MKL-DNN:
+  - Added support for more activation functions: "sigmoid", "tanh", and "softrelu". ([#10336](https://github.com/apache/incubator-mxnet/pull/10336))
+  - Added debugging functionality: Result check ([#12069](https://github.com/apache/incubator-mxnet/pull/12069)) and Backend switch ([#12058](https://github.com/apache/incubator-mxnet/pull/12058)).
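+
+On an MKL-DNN-enabled build, the newly supported activation types go through the regular `Activation` operator on CPU; a minimal sketch:
+
+```python
+import mxnet as mx
+
+x = mx.nd.random.uniform(-1, 1, shape=(8, 32))  # CPU NDArray
+for act in ('sigmoid', 'tanh', 'softrelu'):
+    y = mx.nd.Activation(x, act_type=act)
+    print(act, y.shape)
+```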
+
+### New Features - Gluon Model Zoo Pre-trained Models
+- Gluon Vision Model Zoo now provides MobileNetV2 pre-trained models (#10879) in addition to
+  AlexNet, DenseNet, Inception V3, MobileNetV1, ResNet V1 and V2, SqueezeNet 1.0 and 1.1, and VGG
+  pre-trained models.
+- Updated pre-trained models provide state-of-the-art performance for all resnetv1, resnetv2, vgg16, vgg19, vgg16_bn, and vgg19_bn models (#11327, #11860, #11830).
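+
+Loading one of the pre-trained models is a one-liner; a minimal sketch (assuming the `mobilenet_v2_1_0` constructor name in `gluon.model_zoo.vision`):
+
+```python
+import mxnet as mx
+from mxnet.gluon.model_zoo import vision
+
+net = vision.mobilenet_v2_1_0(pretrained=True)  # downloads weights on first use
+x = mx.nd.random.uniform(shape=(1, 3, 224, 224))
+prob = net(x).softmax()
+print(prob.topk(k=5))  # indices of the five most likely classes
+```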
+
 ### New Features - Scala API Improvements
 - Improvements to MXNet Scala API usability([#10660](https://github.com/apache/incubator-mxnet/pull/10660), [#10787](https://github.com/apache/incubator-mxnet/pull/10787), [#10991](https://github.com/apache/incubator-mxnet/pull/10991))
 - Symbol.api and NDArray.api bring a new set of functions with complete definitions for all arguments.
@@ -8,55 +25,57 @@ MXNet Change Log
 
 ### New Features - Scala examples
 - Refurbished Scala examples with improved API, documentation and CI test coverage. ([#11753](https://github.com/apache/incubator-mxnet/pull/11753), [#11621](https://github.com/apache/incubator-mxnet/pull/11621))
-- Now all Scala examples have: 
+- Now all Scala examples have:
   - No blocking bugs in the middle
   - A good README to start with
   - Type-safe API usage throughout
   - CI monitoring on each PR run
 
-### New Features - Export MXNet models to ONNX format
-- With this feature, now MXNet models can be exported to ONNX format([#11213](https://github.com/apache/incubator-mxnet/pull/11213)). Currently, MXNet supports ONNX v1.2.1. [API documentation](http://mxnet.incubator.apache.org/api/python/contrib/onnx.html).
-- Checkout this [tutorial](http://mxnet.incubator.apache.org/tutorials/onnx/export_mxnet_to_onnx.html) which shows how to use MXNet to ONNX exporter APIs. ONNX protobuf so that those models can be imported in other frameworks for inference.
-
-### New Features - Topology-aware AllReduce (experimental)
-- This features uses trees to perform the Reduce and Broadcast. It uses the idea of minimum spanning trees to do a binary tree Reduce communication pattern to improve it. This topology aware approach reduces the existing limitations for single machine communication shown by mehods like parameter server and NCCL ring reduction. It is an experimental feature ([#11591](https://github.com/apache/incubator-mxnet/pull/11591)).
-- Paper followed for implementation: [Optimal message scheduling for aggregation](https://www.sysml.cc/doc/178.pdf).
-
 ### New Features - Clojure package (experimental)
 - MXNet now supports the Clojure programming language. The MXNet Clojure package brings flexible and efficient GPU computing and state-of-the-art deep learning to Clojure. It enables you to write seamless tensor/matrix computation with multiple GPUs in Clojure. It also lets you construct and customize state-of-the-art deep learning models in Clojure, and apply them to tasks such as image classification and data science challenges. ([#11205](https://github.com/apache/incubator-mxnet/pull/11205))
 - Check out examples and API documentation [here](http://mxnet.incubator.apache.org/api/clojure/index.html).
 
-### New Features - TensorRT Runtime Integration (experimental)
-- [TensorRT](https://developer.nvidia.com/tensorrt) provides significant acceleration of model inference on NVIDIA GPUs compared to running the full graph in MxNet using unfused GPU operators. In addition to faster fp32 inference, TensorRT optimizes fp16 inference, and is capable of int8 inference (provided the quantization steps are performed). Besides increasing throughput, TensorRT significantly reduces inference latency, especially for small batches.
-- This feature in MXNet now introduces runtime integration of TensorRT into MXNet, in order to accelerate inference.([#11325](https://github.com/apache/incubator-mxnet/pull/11325))
-- Currently, its in contrib package.
+### New Features - Synchronized Cross-GPU Batch Norm (experimental)
+- Gluon now supports Synchronized Batch Normalization (#11502).
+- This enables stable training of large-scale networks with high memory consumption, such as FCN for image segmentation; a usage sketch follows below.
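+
+A minimal sketch of a Gluon network using it (assuming the block is exposed as `gluon.contrib.nn.SyncBatchNorm` with a `num_devices` argument, and that four GPUs are available):
+
+```python
+import mxnet as mx
+from mxnet import gluon
+from mxnet.gluon.contrib.nn import SyncBatchNorm
+
+net = gluon.nn.HybridSequential()
+with net.name_scope():
+    net.add(gluon.nn.Conv2D(64, kernel_size=3, padding=1),
+            SyncBatchNorm(num_devices=4),  # statistics synchronized across the 4 GPUs
+            gluon.nn.Activation('relu'))
+net.initialize(ctx=[mx.gpu(i) for i in range(4)])
+```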
 
 ### New Features - Sparse Tensor Support for Gluon (experimental)
 - Sparse gradient support is added to nn.Embedding. ([#10924](https://github.com/apache/incubator-mxnet/pull/10924))
 - Gluon Parameter now supports "row_sparse" stype, which speeds up multi-GPU training ([#11001](https://github.com/apache/incubator-mxnet/pull/11001), [#11429](https://github.com/apache/incubator-mxnet/pull/11429))
 - Gluon HybridBlock now supports hybridization with sparse operators ([#11306](https://github.com/apache/incubator-mxnet/pull/11306)).
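+
+For example, a large embedding layer with sparse gradients (a minimal sketch; the vocabulary size is arbitrary):
+
+```python
+import mxnet as mx
+from mxnet import gluon
+
+# sparse_grad=True makes the backward pass produce a row_sparse gradient
+embedding = gluon.nn.Embedding(input_dim=1000000, output_dim=128, sparse_grad=True)
+embedding.initialize()
+
+tokens = mx.nd.array([[1, 42, 7], [5, 5, 9]])
+vectors = embedding(tokens)  # shape (2, 3, 128)
+```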
 
-### New Features - Fused RNN Operators for CPU
-- MXNet already provides fused RNN operators for users to run on GPU with CuDNN interfaces. But there was no support to use these operators on CPU.
-- Now with this release, MXNet provides these operators for CPUs too!
-- Fused RNN operators added for CPU: LSTM([#10104](https://github.com/apache/incubator-mxnet/pull/10104)), vanilla RNN([#10104](https://github.com/apache/incubator-mxnet/pull/10104)), GRU([#10311](https://github.com/apache/incubator-mxnet/pull/10311))
-
-### New Features - Control flow operators
-- This is the first step towards optimizing dynamic neural networks by adding symbolic and imperative control flow operators. [Proposal](https://cwiki.apache.org/confluence/display/MXNET/Optimize+dynamic+neural+network+models+with+control+flow+operators).
+### New Features - Control flow operators (experimental)
+- This is the first step towards optimizing dynamic neural networks with variable computation graphs, by adding symbolic and imperative control flow operators. [Proposal](https://cwiki.apache.org/confluence/display/MXNET/Optimize+dynamic+neural+network+models+with+control+flow+operators).
 - New operators introduced: foreach([#11531](https://github.com/apache/incubator-mxnet/pull/11531)), while_loop([#11566](https://github.com/apache/incubator-mxnet/pull/11566)), cond([#11760](https://github.com/apache/incubator-mxnet/pull/11760)).
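+
+A minimal sketch of the imperative `foreach` operator, computing a running sum over the first axis:
+
+```python
+import mxnet as mx
+
+def step(data, states):
+    # add the current time step to the running sum carried in states[0]
+    total = data + states[0]
+    return total, [total]
+
+data = mx.nd.arange(6).reshape((3, 2))  # 3 "time steps" of shape (2,)
+outputs, states = mx.nd.contrib.foreach(step, data, [mx.nd.zeros((2,))])
+print(outputs)  # per-step running sums, shape (3, 2)
+```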
 
-### New Features - MXNet Model Backwards Compatibility Checker
-- This tool ([#11626](https://github.com/apache/incubator-mxnet/pull/11626)) helps in ensuring consistency and sanity while performing inference on the latest version of MXNet using models trained on older versions of MXNet. 
-- This tool will help in detecting issues earlier in the development cycle which break backwards compatibility on MXNet and would contribute towards ensuring a healthy and stable release of MXNet.
+### New Features - Rounding GPU Memory Pool for dynamic networks with variable-length input-output (experimental)
+- MXNet now supports a new memory pool type for GPU memory (#11041).
+- Unlike the default memory pool, which requires an exact size match to reuse released memory chunks, this new pool uses exponential-linear rounding so that similarly sized chunks can be reused. This is better suited to workloads with dynamic-shape inputs and outputs. Set the environment variable `MXNET_GPU_MEM_POOL_TYPE=Round` to enable it, as sketched below.
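+
+A minimal sketch of enabling the rounded pool from Python (setting the variable before importing MXNet is the safest way to ensure it is picked up):
+
+```python
+import os
+os.environ['MXNET_GPU_MEM_POOL_TYPE'] = 'Round'  # use the rounding GPU memory pool
+
+import mxnet as mx
+x = mx.nd.zeros((128, 256), ctx=mx.gpu(0))  # allocations now reuse rounded-size chunks
+```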
 
-### MKL-DNN improvements
-- Introducing more functionality support for MKL-DNN as follows:
-  - Added support for more activation functions like, "sigmoid", "tanh", "softrelu". ([#10336](https://github.com/apache/incubator-mxnet/pull/10336))
-  - Added Debugging functionality: Result check ([#12069](https://github.com/apache/incubator-mxnet/pull/12069)) and Backend switch ([#12058](https://github.com/apache/incubator-mxnet/pull/12058)).
+### New Features - Topology-aware AllReduce (experimental)
+- This feature uses trees to perform the Reduce and Broadcast operations. It uses the idea of minimum spanning trees to build a binary-tree Reduce communication pattern. This topology-aware approach reduces the limitations of existing single-machine communication methods such as parameter server and NCCL ring reduction. It is an experimental feature ([#11591](https://github.com/apache/incubator-mxnet/pull/11591)).
+- Paper followed for implementation: [Optimal message scheduling for aggregation](https://www.sysml.cc/doc/178.pdf).
+
+### New Features - Export MXNet models to ONNX format (experimental)
+- With this feature, MXNet models can now be exported to the ONNX format ([#11213](https://github.com/apache/incubator-mxnet/pull/11213)). Currently, MXNet supports ONNX v1.2.1. See the [API documentation](http://mxnet.incubator.apache.org/api/python/contrib/onnx.html).
+- Check out this [tutorial](http://mxnet.incubator.apache.org/tutorials/onnx/export_mxnet_to_onnx.html), which shows how to use the MXNet-to-ONNX exporter APIs to produce an ONNX protobuf so that models can be imported in other frameworks for inference; a short sketch follows below.
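+
+A minimal sketch of exporting a saved symbol/params checkpoint (the file names and input shape are only examples; the `onnx` Python package must be installed):
+
+```python
+import numpy as np
+from mxnet.contrib import onnx as onnx_mxnet
+
+# converts the checkpoint to ONNX and returns the path of the written file
+onnx_file = onnx_mxnet.export_model('resnet-symbol.json', 'resnet-0000.params',
+                                    [(1, 3, 224, 224)], np.float32, 'resnet.onnx')
+```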
+
+### New Features - TensorRT Runtime Integration (experimental)
+- [TensorRT](https://developer.nvidia.com/tensorrt) provides significant acceleration of model inference on NVIDIA GPUs compared to running the full graph in MXNet using unfused GPU operators. In addition to faster fp32 inference, TensorRT optimizes fp16 inference and is capable of int8 inference (provided the quantization steps are performed). Besides increasing throughput, TensorRT significantly reduces inference latency, especially for small batches.
+- This release introduces runtime integration of TensorRT into MXNet in order to accelerate inference ([#11325](https://github.com/apache/incubator-mxnet/pull/11325)).
+- Currently, it is in the contrib package.
 
-### Flaky Tests improvement effort
+### Maintenance - Flaky Tests improvement effort
+- Fixed 130 flaky tests on CI. Tracked progress of the project [here](https://github.com/apache/incubator-mxnet/projects/9).
 - Added flakiness checker (#11572)
-- Fixed 130 flaky tests on CI. Tracked progress of the project [here](https://github.com/apache/incubator-mxnet/projects/9). 
+
+### Maintenance - MXNet Model Backwards Compatibility Checker
+- This tool ([#11626](https://github.com/apache/incubator-mxnet/pull/11626)) helps ensure that models trained on older versions of MXNet still produce consistent, sane results when used for inference on the latest version of MXNet.
+- It helps detect backwards-compatibility breakages earlier in the development cycle and contributes towards healthy and stable MXNet releases.
+
+### Maintenance - Integrated testing for "the Straight Dope"
+- ["Deep Learning - The Straight Dope"](http://gluon.mxnet.io) is a deep learning book based on Apache MXNet Gluon, with contributions from many Gluon users.
+- Testing of this book is now integrated into the nightly tests.
 
 ### Bug-fixes
 - Fix gperftools/jemalloc and lapack warning bug. (#11110)
@@ -82,12 +101,15 @@ MXNet Change Log
 - Fix quantized graph pass bug (#11937)
 - Fix MXPredReshape in the c_predict_api (#11493)
 - Fix the topk regression issue (#12197)
+- Fix image-classification example and add missing optimizers w/ momentum support (#11826)
 
 ### Performance Improvements
-- Added static allocation for HybridBloc gluon (#11320)
+- Added static allocation and static shape for Gluon HybridBlock (#11320)
 - Fix RecordIO augmentation speed (#11474)
 - Improve sparse pull performance for gluon trainer (#11429)
+- CTC operator performance improvement from HawkAaron/MXNet-CTC (#11834)
 - Improve performance of broadcast ops backward pass (#11252)
+- Improved numerical stability as a result of using stable L2 norm (#11573)
 - Accelerate the performance of topk on the CPU side (#12085)
 - Support for dot(dns, csr) = dns and dot(dns, csr.T) = dns on CPU ([#11113](https://github.com/apache/incubator-mxnet/pull/11113))
 - Performance improvement for Batch Dot on CPU from mshadow ([mshadow PR#342](https://github.com/dmlc/mshadow/pull/342))
@@ -121,16 +143,13 @@ MXNet Change Log
 - Add test for new int64 type in CSVIter (#11499)
 - Add sample ratio for ROI Align (#11145)
 - Shape and Size Operator (#10889)
-- Add HybidSequentialRNNCell, which can be nested in HybridBlock (#10514)
+- Add HybridSequentialRNNCell, which can be nested in HybridBlock (#11003)
 - Support for a bunch of unary functions for csr matrices (#11559)
-- Added stable norm 2 Reducer (#11573)
-- Added Synchronized Batch Normalization (#11502)
 - Added NDArrayCollector to dispose intermediate allocated NDArrays automatically (#11751)
 - Added the diag() operator (#11643)
 - Added broadcast_like operator (#11820)
 - Allow Partial shape infer for Slice (#11406)
 - Added support to profile kvstore server during distributed training  (#11215)
-- Make gluon rnn layers hybrid blocks (#11482)
 - Add function for GPU Memory Query to C API (#12083)
 - Generalized reshape_like operator to be more flexible (#11928)
 - Add support for selu activation function (#12059)
@@ -190,7 +209,7 @@ For more information and examples, see [full release notes](https://cwiki.apache
 - Added support for distributed mixed precision training with FP16. It supports storing a master copy of the weights in float32 with the multi_precision mode of optimizers (#10183). Improved speed of float16 operations on x86 CPU by 8 times through the F16C instruction set. Added support for more operators to work with FP16 inputs (#10125, #10078, #10169). Added a tutorial on using mixed precision with FP16 (#10391).
 
 ### New Features - Added Profiling Enhancements
-- Enhanced built-in profiler to support native Intel:registered: VTune:tm: Amplifier objects such as Task, Frame, Event, Counter and Marker from both C++ and Python -- which is also visible in the Chrome tracing view(#8972). Added Runtime tracking of symbolic and imperative operators as well as memory and API calls. Added Tracking and dumping of aggregate profiling data. Profiler also no longer affects runtime performance when not in use. 
+- Enhanced built-in profiler to support native Intel:registered: VTune:tm: Amplifier objects such as Task, Frame, Event, Counter and Marker from both C++ and Python -- which is also visible in the Chrome tracing view(#8972). Added Runtime tracking of symbolic and imperative operators as well as memory and API calls. Added Tracking and dumping of aggregate profiling data. Profiler also no longer affects runtime performance when not in use.
 
 ### Breaking Changes
 - Changed Namespace for MXNet scala from `ml.dmlc.mxnet` to `org.apache.mxnet` (#10284).
@@ -221,7 +240,7 @@ For more information and examples, see [full release notes](https://cwiki.apache
 - Fixed a bug that was causing training metrics to be printed as NaN sometimes (#10437).
 - Fixed a crash with non positive reps for tile ops (#10417).
 
-### Performance Improvements 
+### Performance Improvements
 - On average, after the MKL-DNN change, the inference speed of MXNet + MKLDNN outperforms MXNet + OpenBLAS by a factor of 32, outperforms MXNet + MKLML by 82% and outperforms MXNet + MKLML with the experimental flag by 8%. The experiments were run for the image classification example, for different networks and different batch sizes.
 - Improved sparse SGD, sparse AdaGrad and sparse Adam optimizer speed on GPU by 30x (#9561, #10312, #10293, #10062).
 - Improved `sparse.retain` performance on CPU by 2.5x (#9722)
@@ -326,7 +345,7 @@ For more information and examples, see [full release notes](https://cwiki.apache
 - Added `axis` argument to `SequenceLast`, `SequenceMask` and `SequenceReverse` operators (#9306)
 - Added `lazy_update` option for standard `SGD` & `Adam` optimizer with `row_sparse` gradients (#9468, #9189)
 - Added `select` option in `Block.collect_params` to support regex (#9348)
-- Added support for (one-to-one and sequence-to-one) inference on explicit unrolled RNN models in R (#9022) 
+- Added support for (one-to-one and sequence-to-one) inference on explicit unrolled RNN models in R (#9022)
 ### Deprecations
 - The Scala API namespace is still called `ml.dmlc`. The namespace is likely to be changed in a future release to `org.apache` and might break existing applications and scripts (#9579, #9324)
 ### Performance Improvements
@@ -372,10 +391,10 @@ For more information and examples, see [full release notes](https://cwiki.apache
   - MXNet now compiles and runs on NVIDIA Jetson TX2 boards with GPU acceleration.
   - You can install the python MXNet package on a Jetson board by running - `$ pip install mxnet-jetson-tx2`.
 ### New Features - Sparse Tensor Support [General Availability]
-  - Added more sparse operators: `contrib.SparseEmbedding`, `sparse.sum` and `sparse.mean`. 
+  - Added more sparse operators: `contrib.SparseEmbedding`, `sparse.sum` and `sparse.mean`.
   - Added `asscipy()` for easier conversion to scipy.
   - Added `check_format()` for sparse ndarrays to check if the array format is valid.
-### Bug-fixes  
+### Bug-fixes
   - Fixed a[-1] indexing not working on `NDArray`.
   - Fixed `expand_dims` if axis < 0.
   - Fixed a bug that causes topk to produce incorrect result on large arrays.
@@ -387,9 +406,9 @@ For more information and examples, see [full release notes](https://cwiki.apache
 ### Doc Updates
   - Added a security best practices document under FAQ section.
   - Fixed License Headers including restoring copyright attributions.
-  - Documentation updates. 
+  - Documentation updates.
   - Links for viewing source.
- 
+
  For more information and examples, see [full release notes](https://cwiki.apache.org/confluence/display/MXNET/Apache+MXNet+%28incubating%29+1.0+Release+Notes)
 
 
@@ -397,15 +416,15 @@ For more information and examples, see [full release notes](https://cwiki.apache
 ### Bug-fixes
   - Added GPU support for the `syevd` operator which ensures that there is GPU support for all linalg-operators.
   - Bugfix for `syevd` on CPU such that it works for `float32`.
-  - Fixed API call when `OMP_NUM_THREADS` environment variable is set. 
+  - Fixed API call when `OMP_NUM_THREADS` environment variable is set.
   - Fixed `MakeNonlossGradNode` bug.
-  - Fixed bug related to passing `dtype` to `array()`. 
+  - Fixed bug related to passing `dtype` to `array()`.
   - Fixed some minor bugs for sparse distributed training.
-  - Fixed a bug on `Slice` accessing uninitialized memory in `param.begin` in the file `matrix_op-inl.h`. 
+  - Fixed a bug on `Slice` accessing uninitialized memory in `param.begin` in the file `matrix_op-inl.h`.
   - Fixed `gluon.data.RecordFileDataset`.
   - Fixed a bug that caused `autograd` to crash on some networks.
-  
-  
+
+
 ## 0.12.0
 ### Performance
   - Added full support for NVIDIA Volta GPU Architecture and CUDA 9. Training CNNs is up to 3.5x faster than Pascal when using float16 precision.
@@ -413,7 +432,7 @@ For more information and examples, see [full release notes](https://cwiki.apache
   - Improved ImageRecordIO image loading performance and added indexed RecordIO support.
   - Added better openmp thread management to improve CPU performance.
 ### New Features - Gluon
-  - Added enhancements to the Gluon package, a high-level interface designed to be easy to use while keeping most of the flexibility of low level API. Gluon supports both imperative and symbolic programming, making it easy to train complex models imperatively with minimal impact on performance. Neural networks (and other machine learning models) can be defined and trained with `gluon.nn` and `gluon.rnn` packages. 
+  - Added enhancements to the Gluon package, a high-level interface designed to be easy to use while keeping most of the flexibility of the low-level API. Gluon supports both imperative and symbolic programming, making it easy to train complex models imperatively with minimal impact on performance. Neural networks (and other machine learning models) can be defined and trained with the `gluon.nn` and `gluon.rnn` packages.
   - Added new loss functions - `SigmoidBinaryCrossEntropyLoss`, `CTCLoss`, `HuberLoss`, `HingeLoss`, `SquaredHingeLoss`, `LogisticLoss`, `TripletLoss`.
   - `gluon.Trainer` now allows reading and setting learning rate with `trainer.learning_rate` property.
   - Added API `HybridBlock.export` for exporting gluon models to MXNet format.
@@ -426,7 +445,7 @@ For more information and examples, see [full release notes](https://cwiki.apache
   - Added `mx.autograd.grad` and experimental second order gradient support (most operators don't support second order gradient yet).
   - Autograd now supports cross-device graphs. Use `x.copyto(mx.gpu(i))` and `x.copyto(mx.cpu())` to do computation on multiple devices.
 ### New Features - Sparse Tensor Support
-  - Added support for sparse matrices. 
+  - Added support for sparse matrices.
   - Added limited cpu support for two sparse formats in `Symbol` and `NDArray` - `CSRNDArray` and `RowSparseNDArray`.
   - Added a sparse dot product operator and many element-wise sparse operators.
   - Added a data iterator for sparse data input - `LibSVMIter`.
@@ -436,7 +455,7 @@ For more information and examples, see [full release notes](https://cwiki.apache
   - Added limited support for fancy indexing, which allows you to very quickly access and modify complicated subsets of an array's values. `x[idx_arr0, idx_arr1, ..., idx_arrn]` is now supported. Features such as combining and slicing are planned for the next release. Check out master to get a preview.
   - Random number generators in `mx.nd.random.*` and `mx.sym.random.*` now support both CPU and GPU.
   - `NDArray` and `Symbol` now support "fluent" methods. You can now use `x.exp()` etc. instead of `mx.nd.exp(x)` or `mx.sym.exp(x)`.
-  - Added `mx.rtc.CudaModule` for writing and running CUDA kernels from python. 
+  - Added `mx.rtc.CudaModule` for writing and running CUDA kernels from python.
   - Added `multi_precision` option to optimizer for easier float16 training.
   - Better support for IDE auto-completion. IDEs like PyCharm can now correctly parse mxnet operators.
 ### API Changes
@@ -484,14 +503,14 @@ For more information and examples, see [full release notes](https://cwiki.apache
 
 
 ## 0.10.0
-- Overhauled documentation for commonly used Python APIs, Installation instructions, Tutorials, HowTos and MXNet Architecture.  
-- Updated mxnet.io for improved readability.  
-- Pad operator now support reflection padding.  
-- Fixed a memory corruption error in threadedengine.  
-- Added CTC loss layer to contrib package. See mx.contrib.sym.ctc_loss.  
-- Added new sampling operators for several distributions (normal,uniform,gamma,exponential,negative binomial).  
+- Overhauled documentation for commonly used Python APIs, Installation instructions, Tutorials, HowTos and MXNet Architecture.
+- Updated mxnet.io for improved readability.
+- Pad operator now supports reflection padding.
+- Fixed a memory corruption error in threadedengine.
+- Added CTC loss layer to contrib package. See mx.contrib.sym.ctc_loss.
+- Added new sampling operators for several distributions (normal, uniform, gamma, exponential, negative binomial).
 - Added documentation for experimental RNN APIs.
- 
+
 ## 0.9.3
 - Move symbolic API to NNVM @tqchen
  - Most front-end C APIs are backward compatible


 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services