Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2018/12/10 05:07:35 UTC

[GitHub] eric-haibin-lin opened a new issue #13598: More fine-grained operator implementation dispatch & memory planning flow

URL: https://github.com/apache/incubator-mxnet/issues/13598
 
 
   ## Existing Execution Flow
   ```
   g = graph()
   shapes = g.infer_shape()
   types = g.infer_type()
   storage_types, dispatch_modes = g.infer_storage_type()
   memory_plan = nnvm::plan_memory() // which calls node.FInplaceOption(node.attrs)
   for node in g:
     fcompute = get_fcompute(node)
     fcompute(x)
   ```
   ### Drawbacks of the existing flow
   - the selection of the MKL/CPU/GPU/CUDNN implementation happens after graph attribute inference and memory planning. **Memory planning is therefore unaware of the implementation that will actually be used for execution, which can lead to sub-optimal plans.** For example, the in-place memory option may differ across accelerator backends (newer versions of CUDNN allow x/dx in-place for `_backward_conv`); see the sketch after this list.
   - some sparse operators need access to dtype/shape information to decide which implementation to invoke for execution and whether to perform a fallback. This information is not yet exposed through the existing infer storage type interface.
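   To make the first drawback concrete, here is a minimal sketch (the function name is hypothetical) of why a backend-agnostic FInplaceOption has to stay conservative today: it is consulted during memory planning, before any CUDNN/GPU/CPU implementation has been selected, so it cannot safely advertise the x/dx in-place pair that only the CUDNN kernel supports.
   ```
   // Hypothetical illustration of the status quo: FInplaceOption is registered
   // per operator, not per implementation, so it cannot know whether the CUDNN
   // kernel (which tolerates x and dx sharing memory) will actually run.
   // It therefore has to return no in-place pairs, even on CUDNN builds.
   std::vector<std::pair<int, int>> BackwardConvInplaceOption(
       const nnvm::NodeAttrs& attrs) {
     // The chosen backend is unknown at memory-planning time.
     return {};
   }
   ```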
   
   ## Alternative Flow
   op implementations
   ```
   void ConvolutionComputeCUDNN(const nnvm::NodeAttrs& attrs,
                                const OpContext& ctx,
                                const std::vector<TBlob>& inputs,
                                const std::vector<OpReqType>& req,
                                const std::vector<TBlob>& outputs) {
     // CUDNN implementation goes here
   }
   
   void ConvolutionComputeMKL(const nnvm::NodeAttrs& attrs,
                              const OpContext& ctx,
                              const std::vector<NDArray>& inputs,
                              const std::vector<OpReqType>& req,
                              const std::vector<NDArray>& outputs) {
     // MKL implementation goes here
   }
   
   void ConvolutionComputeGPU(const nnvm::NodeAttrs& attrs,
                              const OpContext& ctx,
                              const std::vector<TBlob>& inputs,
                              const std::vector<OpReqType>& req,
                              const std::vector<TBlob>& outputs) {
     // GPU implementation goes here
   }
   
   void ConvolutionComputeCPU(const nnvm::NodeAttrs& attrs,
                              const OpContext& ctx,
                              const std::vector<TBlob>& inputs,
                              const std::vector<OpReqType>& req,
                              const std::vector<TBlob>& outputs) {
     // CPU implementation goes here
   }
   ```
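   For the dispatch below to record its choice, the node attributes need a slot for the selected implementation. A minimal sketch of one way to do that, assuming a new exec_func field (the alias name, the pointer type, and the field placement are illustration-only assumptions; the NDArray-based MKL variant would need a parallel slot or a variant type):
   ```
   // Hypothetical function-pointer type for a TBlob-based compute function,
   // matching the stubs above. A plain pointer keeps the equality check used
   // by FInplaceOption further down straightforward.
   typedef void (*FComputeFn)(const nnvm::NodeAttrs&,
                              const OpContext&,
                              const std::vector<TBlob>&,
                              const std::vector<OpReqType>&,
                              const std::vector<TBlob>&);
   
   // NodeAttrs extended with the implementation chosen during storage type
   // inference; FInplaceOption and the executor read it back later.
   struct NodeAttrs {
     // ... existing nnvm::NodeAttrs members (op, name, dict, parsed, ...) ...
     FComputeFn exec_func = nullptr;  // set by FInferStorageTypeEx
   };
   ```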
   new FInferStorageTypeEx interface
   ```
   void FInferStorageTypeEx(const std::vector<TShape>& in_shapes,
                            const std::vector<int>& in_types,
                            const std::vector<int>& in_stypes,
                            const std::vector<TShape>& out_shapes,
                            const std::vector<int>& out_types,
                            std::vector<int>* out_stypes,
                            int dev_mask,
                            NodeAttrs* attrs,            // mutable
                            DispatchMode* dispatch_mode  // mutable
                            ) {
     // GPU
     if (dev_mask == kGPU) {
       (*out_stypes)[0] = kDefaultStorage;
       *dispatch_mode = DispatchMode::kFCompute;
   #if MXNET_USE_CUDNN
       if (attrs->params.kernel.ndim() == 2 && in_types[0] == mshadow::kFloat32 &&
           in_shapes[0].ndim() == 1 && …) {
         attrs->exec_func = ConvolutionComputeCUDNN;
       } else {
         attrs->exec_func = ConvolutionComputeGPU;
       }
   #else
       attrs->exec_func = ConvolutionComputeGPU;
   #endif
     // CPU
     } else {
   #if MXNET_USE_MKLDNN
       attrs->exec_func = ConvolutionComputeMKL;
       (*out_stypes)[0] = kDefaultStorage;
       *dispatch_mode = DispatchMode::kFComputeEx;
   #else
       attrs->exec_func = ConvolutionComputeCPU;
       ...
   #endif
     }
   }
   ```
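   One way the new attribute could be wired up, sketched with the existing NNVM registration style; the attribute string "FInferStorageTypeEx" follows the proposal above, while ConvStorageTypeEx is a placeholder name for the inference function just shown. The FInplaceOption lambda in the next snippet would be registered on the same operator in the same way.
   ```
   // Sketch only: attaching the proposed attribute with the existing NNVM
   // registration mechanism. The attribute type is assumed to mirror the
   // function signature sketched above.
   NNVM_REGISTER_OP(Convolution)
   .set_attr<FInferStorageTypeEx>("FInferStorageTypeEx", ConvStorageTypeEx);
   ```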
   FInplaceOption for convolution
   ```
   [](const NodeAttrs& attrs) -> std::vector<std::pair<int, int>> {
     if (attrs.exec_func == ConvolutionComputeCUDNN) {
       return {{0, 0}};  // input 0 may share memory with output 0
     } else {
       return {};
     }
   }
   ```
   New Execution Flow:
   ```
   g = graph()
   shapes = g.infer_shape()
   types = g.infer_type()
   if (g.has_attr('FInferStorageTypeEx')) {
     storage_types, dispatch_modes = g.infer_storage_type_ex()
   } else {
     storage_types, dispatch_modes = g.infer_storage_type()
   }
   memory_plan = nnvm::plan_memory() // which calls node.FInplaceOption(node.attrs)
   for node in g:
     if (node.attrs.exec_func) {
       fcompute = node.attrs.exec_func
     } else {
       fcompute = get_fcompute(node)
     }
     fcompute(x)
   ```
   
   @DickJC123 @ptrendx @piiswrong @reminisce 
