Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2022/11/17 06:48:59 UTC

[GitHub] [tvm] srkreddy1238 opened a new pull request, #13413: [RUNTIME][OPENCL] OpenCL host pointer support to acheive zero copy

srkreddy1238 opened a new pull request, #13413:
URL: https://github.com/apache/tvm/pull/13413

   OpenCL lets the host access device memory through memory mapping. The OpenCL flag "CL_MEM_ALLOC_HOST_PTR" enables this when a memory object is created.
   
   We enable this feature via the compilation setting "USE_OPENCL_ENABLE_HOST_PTR" and expose it through a new API, "GetNativePtr", on DeviceAPI and on the NDArray class.
   
   This allows an application to use hardware-allocated memory directly while preparing the input. On the user side, we allocate an NDArray of the same size as the graph input, access its native memory, and finally call set_input_zero_copy to set the input.
   
   Pseudocode looks like:
   
   auto narr = tvm::runtime::NDArray::Empty(shape, {kDLFloat, 32, 1}, {kDLOpenCL, 0});
   void* nptr = narr.GetNativePtr();
   
   ... access memory pointed to by nptr, up to the tensor size ...
   
   tvm::runtime::PackedFunc set_input = mod.GetFunction("set_input_zero_copy");
   set_input(i, narr);
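
The "access memory ... up to the tensor size" step can be sketched as follows, assuming float32 data. The helper names `NumElements` and `PreprocessInto` are hypothetical and not part of TVM, and any writable float buffer can stand in for the memory returned by `GetNativePtr`, so the sketch builds without TVM or OpenCL.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper: number of elements implied by a tensor shape.
std::size_t NumElements(const std::vector<int64_t>& shape) {
  std::size_t n = 1;
  for (int64_t d : shape) {
    if (d < 0) throw std::invalid_argument("negative dimension");
    n *= static_cast<std::size_t>(d);
  }
  return n;
}

// Hypothetical helper: write preprocessed float32 values straight into the
// natively allocated buffer, so no intermediate staging buffer is needed.
// `nptr` stands in for the pointer obtained from GetNativePtr; the loop
// never writes past the tensor size derived from `shape`.
void PreprocessInto(void* nptr, const std::vector<int64_t>& shape,
                    const std::vector<uint8_t>& pixels) {
  const std::size_t n = NumElements(shape);
  if (nptr == nullptr) throw std::invalid_argument("null native pointer");
  if (pixels.size() != n) throw std::invalid_argument("input size does not match shape");
  float* dst = static_cast<float*>(nptr);
  for (std::size_t i = 0; i < n; ++i) {
    dst[i] = static_cast<float>(pixels[i]) / 255.0f;  // scale to [0, 1]
  }
}
```

In an application, `nptr` would come from the NDArray before the call to set_input_zero_copy; here the bounds check against the shape is the part worth keeping, since the native buffer is only as large as the tensor.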


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] tqchen commented on pull request #13413: [RUNTIME][OPENCL] OpenCL host pointer support to acheive zero copy

Posted by GitBox <gi...@apache.org>.
tqchen commented on PR #13413:
URL: https://github.com/apache/tvm/pull/13413#issuecomment-1354739416

   Thanks @srkreddy1238. Indeed, I am not questioning the usefulness of having the NativePtr.
   
   It is just that, given how specific it is, it would benefit from being exposed first through a PackedFunc or OpenCL-specific functionality.


[GitHub] [tvm] tqchen commented on pull request #13413: [RUNTIME][OPENCL] OpenCL host pointer support to acheive zero copy

Posted by GitBox <gi...@apache.org>.
tqchen commented on PR #13413:
URL: https://github.com/apache/tvm/pull/13413#issuecomment-1319994786

   Thanks for the PR. In this particular case, it would be great to think about other means that do not necessarily make surgical changes at the DeviceAPI or NDArray level. Exposing a GetHostPtr from the OpenCL runtime or a packed func could be a good starting point, before motivating a device API change.


[GitHub] [tvm] srkreddy1238 commented on pull request #13413: [RUNTIME][OPENCL] OpenCL host pointer support to acheive zero copy

Posted by GitBox <gi...@apache.org>.
srkreddy1238 commented on PR #13413:
URL: https://github.com/apache/tvm/pull/13413#issuecomment-1318300674

   In general, OpenCL global memory and host-accessible memory both point to DDR (the common system-wide physical memory). We get the zero-copy advantage only if DDR is shared between the two cores.
   
   Mapping multiple memory objects into a process address space won't cause any performance hit (unless there are writes from both sides on the mapped segment). In our case only the input memory object is written by the host; the others are untouched.
   
   The CMake compilation option is there to avoid any unexpected behavior caused by custom hardware and driver implementations.
   


[GitHub] [tvm] srkreddy1238 commented on pull request #13413: [RUNTIME][OPENCL] OpenCL host pointer support to acheive zero copy

Posted by GitBox <gi...@apache.org>.
srkreddy1238 commented on PR #13413:
URL: https://github.com/apache/tvm/pull/13413#issuecomment-1323326740

   Thanks for the review.
   
   TVM benchmarks generally evaluate the ```run``` call and ignore ```set_input``` and ```get_output```. There is a significant end-to-end performance overhead caused by input/output handling (the copies, and using a different input buffer every time also affects the cache). This was very evident when I benchmarked a TVM model in the MLPerf Android app.
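
To make the measurement gap concrete, here is a minimal sketch of timing the three phases separately so that the input/output cost shows up next to the `run` cost. `PhaseTimes`, `TimeUs`, and `MeasureOnce` are hypothetical names for illustration, not TVM APIs; the callables passed in are assumed to wrap whatever the application uses for set_input, run, and get_output.

```cpp
#include <chrono>
#include <functional>

// Hypothetical record of one inference, in microseconds per phase.
struct PhaseTimes {
  double set_input_us;
  double run_us;
  double get_output_us;
  // End-to-end cost as the application sees it.
  double EndToEnd() const { return set_input_us + run_us + get_output_us; }
};

// Runs `fn` once and returns the elapsed time in microseconds,
// measured with a monotonic clock.
inline double TimeUs(const std::function<void()>& fn) {
  const auto start = std::chrono::steady_clock::now();
  fn();
  const auto stop = std::chrono::steady_clock::now();
  return std::chrono::duration<double, std::micro>(stop - start).count();
}

// Times set_input, run, and get_output one after another, in that order.
PhaseTimes MeasureOnce(const std::function<void()>& set_input,
                       const std::function<void()>& run,
                       const std::function<void()>& get_output) {
  PhaseTimes t{};
  t.set_input_us = TimeUs(set_input);
  t.run_us = TimeUs(run);
  t.get_output_us = TimeUs(get_output);
  return t;
}
```

Comparing `run_us` with `EndToEnd()` over many iterations, and with a fresh input buffer each time, is one way to see how much of the application-level latency a `run`-only benchmark leaves out.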
   
   Buffer sharing is a well-known practice and is supported by most edge platforms across cores such as the camera ISP, GPU, CPU, etc. The motivation here is to encourage the runtime backends to support native pointer access. This can retain the TVM performance numbers at the final application level with less overhead.
   
   I am also fine with a packed function for now, until there is more demand to expose native buffers to applications via NDArray.


[GitHub] [tvm] tvm-bot commented on pull request #13413: [RUNTIME][OPENCL] OpenCL host pointer support to acheive zero copy

Posted by GitBox <gi...@apache.org>.
tvm-bot commented on PR #13413:
URL: https://github.com/apache/tvm/pull/13413#issuecomment-1318167317

   Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from [Reviewers](https://github.com/apache/incubator-tvm/blob/master/CONTRIBUTORS.md#reviewers) by @-ing them in a comment.
   
    * cc @areusch, @echuraev, @elvin-n See [#10317](https://github.com/apache/tvm/issues/10317) for details
   
   Generated by [tvm-bot](https://github.com/apache/tvm/blob/main/ci/README.md#github-actions)


[GitHub] [tvm] tqchen merged pull request #13413: [RUNTIME][OPENCL] OpenCL host pointer support to acheive zero copy

Posted by GitBox <gi...@apache.org>.
tqchen merged PR #13413:
URL: https://github.com/apache/tvm/pull/13413

