Posted to issues@mxnet.apache.org by GitBox <gi...@apache.org> on 2021/09/22 21:34:57 UTC

[GitHub] [incubator-mxnet] cyrusbehr opened a new issue #20600: How to efficiently run inference with variable batch sizes with mxnet using C++ API?

cyrusbehr opened a new issue #20600:
URL: https://github.com/apache/incubator-mxnet/issues/20600


   I'm using MXNet 1.8.0.
   
   How can I efficiently deal with variable input batch sizes with MXNet using the C++ API?
   
   Initially, my inference code looks something like the following:
   ```
    ErrorCode FaceRecognizerGPU::getFaceFeatureVectors(std::vector<cv::cuda::GpuMat> &alignedFaceImages, std::vector<Faceprint> &faceprints) {
        auto data = AsData(alignedFaceImages, ctx);

        if (exec != nullptr) {
            // If the incoming batch size differs from the currently bound
            // shape, tear the executor down and re-bind with the new shape.
            if (args["data"].GetShape()[0] != alignedFaceImages.size()) {
                delete exec;

                args["data"] = NDArray(Shape(alignedFaceImages.size(), 3, 112, 112), ctx, false);

                exec = net.SimpleBind(ctx, args, std::map<std::string, NDArray>(),
                                      std::map<std::string, OpReqType>(), auxs);
            }
        }

        data.CopyTo(&(exec->arg_dict()["data"]));

        exec->Forward(false);

        auto embeddings = exec->outputs[0].Copy(Context(kCPU, 0));
        embeddings.WaitToRead();

        // Rest of code here....
    }
   ```
   
   The issue with the above is that any time the batch size changes, the executor is deleted and `SimpleBind` is run with the new input shape. That is a slow operation (I assume it allocates GPU memory and does other setup under the hood), so rapidly switching between batch sizes becomes quite slow.
   
   What I'd like instead is a scheme where the executor is only deleted and re-instantiated if the batch size increases. If the batch size decreases, we can reuse the existing GPU allocation rather than deleting and re-allocating it (though we would presumably need to tell the executor that the input shape has changed). I know this can be done with TensorRT, but I'm not sure how to implement it in MXNet. I was hoping I could do something like the following:
   
   ```
    ErrorCode FaceRecognizerGPU::getFaceFeatureVectors(std::vector<cv::Mat> &alignedFaceImages, std::vector<Faceprint> &faceprints) {
        auto data = AsData(alignedFaceImages, ctx);

        // Re-bind only when there is no executor yet, or when the incoming
        // batch is larger than the one currently allocated.
        if (!m_exec || m_exec->arg_dict()["data"].GetShape()[0] < alignedFaceImages.size()) {
            delete m_exec;  // deleting a nullptr is a no-op

            m_args["data"] = NDArray(Shape(alignedFaceImages.size(), 3, 112, 112), ctx, false);

            m_exec = m_net.SimpleBind(ctx, m_args, std::map<std::string, NDArray>(),
                                      std::map<std::string, OpReqType>(), m_auxs);
        }

        data.CopyTo(&(m_exec->arg_dict()["data"]));

        m_exec->Forward(false);

        auto embeddings = m_exec->outputs[0].Copy(Context(kCPU, 0));
        embeddings.WaitToRead();
        // Rest of code here....
    }
   
   ``` 
   
   However, it fails when the batch size is decreased, because the input size no longer matches what the executor expects (hence the need to somehow signal that the input shape has changed without touching the GPU memory allocation):
   ```
   [14:15:27] /home/cyrus/work/c-sdks/3rd_party_libs/mxnet/build_cuda_11/packaged/include/mxnet-cpp/operator.hpp:141: MXNetError: Check failed: assign(&dattr, vec.at(i)): Incompatible attr in node  at 0-th output: expected [46,3,112,112], got [48,3,112,112]
   ```
   
   Does anyone know how this can be done?
   I think it may be possible with the `mxnet::cpp::Executor::Reshape` function, but I can't find any examples of how to use it.
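   
   For context, the only workaround I've come up with so far is to bind once at a fixed maximum batch size, always run the full batch, and slice out the rows that correspond to real inputs. Below is a minimal sketch of that idea; `kMaxBatch` and the `getFaceFeatureVectorsPadded` name are my own placeholders, not from the MXNet API, and I'm assuming `NDArray::Slice` returns a view along the batch (first) dimension:
   ```
    // Hypothetical sketch: one-time bind at kMaxBatch, pad smaller batches.
    ErrorCode FaceRecognizerGPU::getFaceFeatureVectorsPadded(std::vector<cv::Mat> &alignedFaceImages, std::vector<Faceprint> &faceprints) {
        const size_t n = alignedFaceImages.size();
        auto data = AsData(alignedFaceImages, ctx);  // shape (n, 3, 112, 112)

        if (!m_exec) {
            // One-time bind at the largest batch we ever expect to see.
            m_args["data"] = NDArray(Shape(kMaxBatch, 3, 112, 112), ctx, false);
            m_exec = m_net.SimpleBind(ctx, m_args, std::map<std::string, NDArray>(),
                                      std::map<std::string, OpReqType>(), m_auxs);
        }

        // Copy the real samples into the first n rows of the bound input;
        // rows n..kMaxBatch keep stale data and their outputs are discarded.
        auto inputSlice = m_exec->arg_dict()["data"].Slice(0, n);
        data.CopyTo(&inputSlice);

        m_exec->Forward(false);

        // Keep only the embeddings that correspond to real inputs.
        auto embeddings = m_exec->outputs[0].Slice(0, n).Copy(Context(kCPU, 0));
        embeddings.WaitToRead();
        // Rest of code here....
    }
   ```
   This keeps GPU memory stable, but it wastes compute on the padded rows every call, which is exactly why a `Reshape`-style solution would be preferable.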

