Posted to dev@singa.apache.org by "Chris Yeung (Jira)" <ji...@apache.org> on 2020/11/14 06:34:00 UTC

[jira] [Updated] (SINGA-512) Support of fused ops to increase throughput performance

     [ https://issues.apache.org/jira/browse/SINGA-512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Yeung updated SINGA-512:
------------------------------
    Description: 
cuDNN 7.6 introduces a new API for fused ops, which can accelerate many use cases in ResNet-like networks. With this API it is possible to execute fused operations such as applying per-channel scale and bias, performing activation, computing convolution, and generating batchnorm statistics.

Reference:
https://docs.nvidia.com/deeplearning/cudnn/release-notes/rel_7xx.html#rel_760
 
The goal is to increase the throughput of DL networks. Currently, this task is assigned to Naili.
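
For illustration only, the fused pattern named above can be written out unfused as a plain Python reference. This is a sketch under stated assumptions, not cuDNN or SINGA code: the function names are hypothetical, the layout is a single image stored as C x H x W nested lists, the activation is assumed to be ReLU, the convolution is a direct "valid" cross-correlation, and "batchnorm statistics" is assumed to mean the per-output-channel sum and sum of squares (from which mean and variance follow). A fused kernel would produce the same results in one pass instead of writing each intermediate tensor to memory.

```python
def scale_bias_relu(x, scale, bias):
    """Per-channel y = max(0, scale[c] * x + bias[c]); x is C x H x W."""
    return [[[max(0.0, scale[c] * v + bias[c]) for v in row] for row in x[c]]
            for c in range(len(x))]


def conv2d_valid(x, w):
    """Direct 'valid' cross-correlation; x is C x H x W, w is K x C x R x S."""
    C, H, W = len(x), len(x[0]), len(x[0][0])
    R, S = len(w[0][0]), len(w[0][0][0])
    out = []
    for k in range(len(w)):
        plane = []
        for i in range(H - R + 1):
            row = []
            for j in range(W - S + 1):
                acc = 0.0
                for c in range(C):
                    for r in range(R):
                        for s in range(S):
                            acc += x[c][i + r][j + s] * w[k][c][r][s]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


def bn_statistics(y):
    """Per-output-channel sum and sum of squares of y (K x H' x W')."""
    sums = [sum(v for row in plane for v in row) for plane in y]
    sq_sums = [sum(v * v for row in plane for v in row) for plane in y]
    return sums, sq_sums


def unfused_reference(x, scale, bias, w):
    """Run the three stages one after another (hypothetical reference)."""
    a = scale_bias_relu(x, scale, bias)   # stage 1: scale, bias, activation
    y = conv2d_valid(a, w)                # stage 2: convolution
    sums, sq_sums = bn_statistics(y)      # stage 3: statistics for batchnorm
    return y, sums, sq_sums


# Tiny example: one input channel, two 1x1 filters.
x = [[[1.0, -2.0], [3.0, 4.0]]]
w = [[[[1.0]]], [[[-1.0]]]]
y, sums, sq_sums = unfused_reference(x, scale=[2.0], bias=[1.0], w=w)
```

Each stage here reads and writes a full intermediate tensor; avoiding those round trips is the motivation for fusing them.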

  was:
In Cudnn 7.6, a new API is introduced for fused ops, which can accelerate many use cases in ResNet-like networks. With this new API it is now possible to execute various fused operations such as apply per channel scale and bias, perform activation, compute convolution, and generate batchnorm statistics. 

Reference:
https://docs.nvidia.com/deeplearning/cudnn/release-notes/rel_7xx.html#rel_760
 
The goal is to increase the image throughput of DL networks. Currently, this task is assigned to Naili.


> Support of fused ops to increase throughput performance
> -------------------------------------------------------
>
>                 Key: SINGA-512
>                 URL: https://issues.apache.org/jira/browse/SINGA-512
>             Project: Singa
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Chris Yeung
>            Priority: Major
>
> cuDNN 7.6 introduces a new API for fused ops, which can accelerate many use cases in ResNet-like networks. With this API it is possible to execute fused operations such as applying per-channel scale and bias, performing activation, computing convolution, and generating batchnorm statistics.
> Reference:
> https://docs.nvidia.com/deeplearning/cudnn/release-notes/rel_7xx.html#rel_760
>  
> The goal is to increase the throughput of DL networks. Currently, this task is assigned to Naili.


