
Posted to dev@mxnet.apache.org by Dick Carter <no...@github.com> on 2020/08/05 04:16:36 UTC

Re: [apache/incubator-mxnet] [RFC] v1.8.0 release (#18800)

A major feature of CUDA 11 and cuDNN 8.0 is support for the new A100 GPU and its TensorFloat-32 (TF32) mode of computation.  I would like to include PR https://github.com/apache/incubator-mxnet/pull/18694, "Unittest tolerance handling improvements", which allows MXNet to use TF32 effectively.  The PR also makes sensible adjustments to the unittest tolerances based on device context and dtype, ensuring A100 compatibility with our unittest suite.
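
To illustrate what "tolerances based on device context and dtype" could mean, here is a minimal sketch. It is not the code from the PR. The names `MANTISSA_BITS`, `effective_format`, and `default_rtol`, as well as the safety factor of 8, are assumptions made for this example. The only hard fact it relies on is that TF32 keeps a 10-bit mantissa (the same as float16), so float32 results computed in TF32 mode need a tolerance closer to float16's than to float32's.

```python
# Hypothetical sketch, not the implementation in PR #18694.
# Explicit mantissa (fraction) bits for each numeric format.
MANTISSA_BITS = {"float16": 10, "tf32": 10, "float32": 23, "float64": 52}


def effective_format(dtype, on_gpu=False, tf32_allowed=False):
    """Return the format that actually limits precision.

    Assumption: float32 math on a GPU with TF32 allowed is carried out
    with TF32 precision; every other case uses the dtype as given.
    """
    if dtype == "float32" and on_gpu and tf32_allowed:
        return "tf32"
    return dtype


def default_rtol(dtype, on_gpu=False, tf32_allowed=False, safety=8.0):
    """Pick a relative tolerance as a small multiple of machine epsilon.

    The safety factor is an arbitrary choice for this illustration.
    """
    fmt = effective_format(dtype, on_gpu, tf32_allowed)
    eps = 2.0 ** -MANTISSA_BITS[fmt]
    return safety * eps


if __name__ == "__main__":
    for dtype in ("float16", "float32", "float64"):
        print(dtype, "cpu:", default_rtol(dtype))
    print("float32 gpu+tf32:",
          default_rtol("float32", on_gpu=True, tf32_allowed=True))
```

With this scheme a float32 test running on an A100 with TF32 allowed gets the same tolerance as a float16 test, while the same test on a CPU keeps the tighter float32 tolerance.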

cuDNN 8.0 also brings compatibility with CUDA Graph capture.  I would like to include a PR (nearly complete, but not yet submitted) that enables CUDA Graph use.  This will let MXNet bypass much of the CPU-side preparation for launching identical kernel sequences, which are common in deep learning training and inference workloads.

-- 
View this comment on GitHub:
https://github.com/apache/incubator-mxnet/issues/18800#issuecomment-668969957