You are viewing a plain text version of this content. The canonical link for it is here.

Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2022/04/10 11:12:07 UTC

[GitHub] [tvm] avulisha opened a new pull request, #10953: [Runtime][Vulkan] Add RGP support to TVM for vulkan device

avulisha opened a new pull request, #10953:
URL: https://github.com/apache/tvm/pull/10953

RGP(Raedon GPU Profiler) is a tool used to analyze the applications
run on AMD GPU. RGP captures the data based on VKPresent and provides
the hardware specific information. Allowing the developer to optimize
the application. To add RGP support to TVM, debug labels "AmdFrameBegin"
and "AmdFrameEnd" need to be inserted into the vulkan queue.These Labels
helps the RGP tool to understand the start|end of frame when no present
is available. Thus enabling the RGP tool to capture and analyze the data.

At runtime, set the envirnoment variable "TVM_USE_AMD_RGP=1" to start
inserting the Debug Labels into the vulkan queue.

Signed-off-by: Wilkin Chau <Wi...@amd.com>
Signed-off-by: Anurag Kumar Vulisha <An...@amd.com>

Thanks for contributing to TVM! Please refer to guideline https://tvm.apache.org/docs/contribute/ for useful information and tips. After the pull request is submitted, please request code reviews from [Reviewers](https://github.com/apache/incubator-tvm/blob/master/CONTRIBUTORS.md#reviewers) by @ them in the pull request thread.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [tvm] avulisha commented on a diff in pull request #10953: [Runtime][Vulkan] Add RGP support to TVM for vulkan device

Posted by GitBox <gi...@apache.org>.

avulisha commented on code in PR #10953:
URL: https://github.com/apache/tvm/pull/10953#discussion_r847321958


##########
src/runtime/vulkan/vulkan_stream.cc:
##########
@@ -55,11 +55,15 @@ VulkanStream::VulkanStream(const VulkanDevice* device)
   cb_begin.flags = VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT;
   cb_begin.pInheritanceInfo = 0;
   VULKAN_CALL(vkBeginCommandBuffer(state_->cmd_buffer_, &cb_begin));
+
+  profiler_ = new AmdRgpProfiler(device_);

Review Comment:
   > Please do this only when RGP is enabled.
   
   Okay.  Will Implement your suggestion.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [tvm] masahi commented on pull request #10953: [Runtime][Vulkan] Add RGP support to TVM for vulkan device

Posted by GitBox <gi...@apache.org>.

masahi commented on PR #10953:
URL: https://github.com/apache/tvm/pull/10953#issuecomment-1097351516

   Hi @avulisha (cc @mei-ye), I want to try this. What is your typical workflow? For example, I want to capture the trace from running https://github.com/apache/tvm/blob/main/apps/topi_recipe/gemm/cuda_gemm_square.py. 
   
   It looks like I need to press "Capture profile" button in the profiler UI, but the script quickly finishes before I am able to start capturing. I do see `tvm/src/runtime/vulkan/vulkan_instance.cc:65: Push VK_EXT_debug_utils` logged.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [tvm] avulisha commented on a diff in pull request #10953: [Runtime][Vulkan] Add RGP support to TVM for vulkan device

Posted by GitBox <gi...@apache.org>.

avulisha commented on code in PR #10953:
URL: https://github.com/apache/tvm/pull/10953#discussion_r847321634


##########
src/runtime/vulkan/vulkan_instance.cc:
##########
@@ -59,6 +59,14 @@ VulkanInstance::VulkanInstance() {
     std::vector<const char*> required_extensions{};
     std::vector<const char*> optional_extensions{"VK_KHR_get_physical_device_properties2"};
 
+    // Check if RGP support is needed. If needed, enable VK_EXT_debug_utils extension for
+    // inserting debug labels into the queue.
+    const char* val = getenv("TVM_USE_AMD_RGP");

Review Comment:
   > Use `BoolEnvironmentVar`
   > 
   > https://github.com/apache/tvm/blob/ef7143e1fc1a12993f584ce37dbde38b75966b40/src/runtime/vulkan/vulkan_instance.cc#L36
   
   Hi @masahi,
   Thanks for your time in reviewing the changes.  Will Implement your suggestion.
   Thanks,
   Anurag



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [tvm] avulisha commented on pull request #10953: [Runtime][Vulkan] Add RGP support to TVM for vulkan device

Posted by GitBox <gi...@apache.org>.

avulisha commented on PR #10953:
URL: https://github.com/apache/tvm/pull/10953#issuecomment-1097565953

   > Hi @avulisha (cc @mei-ye), I want to try this. What is your typical workflow? For example, I want to capture the trace from running https://github.com/apache/tvm/blob/main/apps/topi_recipe/gemm/cuda_gemm_square.py.
   > 
   > It looks like I need to press "Capture profile" button in the profiler UI, but the script quickly finishes before I am able to start capturing. So I'm wondering how you typically workaround that issue. I do see `tvm/src/runtime/vulkan/vulkan_instance.cc:65: Push VK_EXT_debug_utils` logged.
   
   Hi @masahi ,
   As Mei was mentioning, the run is very short for the RGP tool to capture the traces. For testing, we can use the frontend tests to capture the traces. "TVM_FFI=ctypes python -m pytest -v tests/python/frontend/onnx/test_forward.py"
   There are many tests that would be run as a part of frontend tests, allowing the RGP tool to capture the traces.
   Thanks,
   Anurag
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [tvm] avulisha commented on a diff in pull request #10953: [Runtime][Vulkan] Add RGP support to TVM for vulkan device

Posted by GitBox <gi...@apache.org>.

avulisha commented on code in PR #10953:
URL: https://github.com/apache/tvm/pull/10953#discussion_r848366715


##########
src/runtime/vulkan/vulkan_stream.h:
##########
@@ -110,6 +125,7 @@ class VulkanStream {
   std::unordered_map<VkDescriptorSet, std::vector<VulkanStreamToken>> deferred_tokens_;
   std::vector<std::function<void(VulkanStreamState*)>> deferred_kernels_;
   VkCommandPool cmd_pool_;
+  VulkanStreamProfiler* profiler_;

Review Comment:
   Hi @masahi,
   We have implemented the AMD-specific trace capture logic (debug labels insertion) into a separate class AmdRgpProfiler, which is derived from VulkanStreamProfiler. During the VulkanStream creation, the AmdRgpProfiler is instantiated and assigned to VulkanStreamProfiler *profiler. Doing this will help in overriding the default VulkanStreamProfiler->capture() with the AmdRgpProfiler->capture().  If we modify the VulkanStreamProfiler *profiler_ to VulkanStreamProfiler profiler_, the overriding of VulkanStreamProfiler ->capture() with AmdRgpProfiler->capture() will not work. Because of this reason, I would suggest using the existing flow.
   Thanks,
   Anurag



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [tvm] avulisha commented on pull request #10953: [Runtime][Vulkan] Add RGP support to TVM for vulkan device

Posted by GitBox <gi...@apache.org>.

avulisha commented on PR #10953:
URL: https://github.com/apache/tvm/pull/10953#issuecomment-1095424252

   Hi @masahi,
   Thanks for your time in reviewing the changes. I have pushed the changes that you have suggested.
   Best Regards,
   Anurag


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [tvm] masahi commented on pull request #10953: [Runtime][Vulkan] Add RGP support to TVM for vulkan device

Posted by GitBox <gi...@apache.org>.

masahi commented on PR #10953:
URL: https://github.com/apache/tvm/pull/10953#issuecomment-1097860668

   Thanks! @avulisha 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [tvm] masahi commented on a diff in pull request #10953: [Runtime][Vulkan] Add RGP support to TVM for vulkan device

Posted by GitBox <gi...@apache.org>.

masahi commented on code in PR #10953:
URL: https://github.com/apache/tvm/pull/10953#discussion_r849313832


##########
src/runtime/vulkan/vulkan_stream.h:
##########
@@ -110,6 +125,7 @@ class VulkanStream {
   std::unordered_map<VkDescriptorSet, std::vector<VulkanStreamToken>> deferred_tokens_;
   std::vector<std::function<void(VulkanStreamState*)>> deferred_kernels_;
   VkCommandPool cmd_pool_;
+  VulkanStreamProfiler* profiler_;

Review Comment:
   yeah sorry, I missed that ` VulkanStreamProfiler` is a base class.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [tvm] masahi merged pull request #10953: [Runtime][Vulkan] Add RGP support to TVM for vulkan device

Posted by GitBox <gi...@apache.org>.

masahi merged PR #10953:
URL: https://github.com/apache/tvm/pull/10953


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [tvm] masahi commented on a diff in pull request #10953: [Runtime][Vulkan] Add RGP support to TVM for vulkan device

Posted by GitBox <gi...@apache.org>.

masahi commented on code in PR #10953:
URL: https://github.com/apache/tvm/pull/10953#discussion_r846844600


##########
src/runtime/vulkan/vulkan_instance.cc:
##########
@@ -59,6 +59,14 @@ VulkanInstance::VulkanInstance() {
     std::vector<const char*> required_extensions{};
     std::vector<const char*> optional_extensions{"VK_KHR_get_physical_device_properties2"};
 
+    // Check if RGP support is needed. If needed, enable VK_EXT_debug_utils extension for
+    // inserting debug labels into the queue.
+    const char* val = getenv("TVM_USE_AMD_RGP");

Review Comment:
   Use `BoolEnvironmentVar`
   https://github.com/apache/tvm/blob/ef7143e1fc1a12993f584ce37dbde38b75966b40/src/runtime/vulkan/vulkan_instance.cc#L36



##########
src/runtime/vulkan/vulkan_wrapped_func.cc:
##########
@@ -164,6 +164,15 @@ void VulkanWrappedFunc::operator()(TVMArgs args, TVMRetValue* rv,
     deferred_token.buffers_[i] = descriptor_buffers[i].buffer;
   }
   device.ThreadLocalStream().LaunchDeferred(deferred_initializer, deferred_kernel, deferred_token);
+
+  if (device.UseDebugUtilsLabel()) {
+    VkDebugUtilsLabelEXT dispatch_label = {VK_STRUCTURE_TYPE_DEBUG_UTILS_LABEL_EXT,
+                                           NULL,
+                                           func_name_.c_str(),
+                                           {0.0f, 0.0f, 0.0f, 0.0f}};
+    device.queue_insert_debug_utils_label_functions->vkQueueInsertDebugUtilsLabelEXT(
+        device.Queue(), &dispatch_label);
+  }

Review Comment:
   Can you add this to the immediate-mode path as well? https://github.com/apache/tvm/blob/f18d8d40bab7218a79392b56cf3d7cbcf6917e47/src/runtime/vulkan/vulkan_wrapped_func.cc#L73
   
   RADV supports push_descriptor so when we run on RADV, we are using that path. And RADV apparently has some support for RGP.



##########
src/runtime/vulkan/vulkan_stream.h:
##########
@@ -110,6 +125,7 @@ class VulkanStream {
   std::unordered_map<VkDescriptorSet, std::vector<VulkanStreamToken>> deferred_tokens_;
   std::vector<std::function<void(VulkanStreamState*)>> deferred_kernels_;
   VkCommandPool cmd_pool_;
+  VulkanStreamProfiler* profiler_;

Review Comment:
   I think this can live on the stack?



##########
src/runtime/vulkan/vulkan_stream.cc:
##########
@@ -55,11 +55,15 @@ VulkanStream::VulkanStream(const VulkanDevice* device)
   cb_begin.flags = VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT;
   cb_begin.pInheritanceInfo = 0;
   VULKAN_CALL(vkBeginCommandBuffer(state_->cmd_buffer_, &cb_begin));
+
+  profiler_ = new AmdRgpProfiler(device_);

Review Comment:
   Please do this only when RGP is enabled.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [tvm] avulisha commented on a diff in pull request #10953: [Runtime][Vulkan] Add RGP support to TVM for vulkan device

Posted by GitBox <gi...@apache.org>.

avulisha commented on code in PR #10953:
URL: https://github.com/apache/tvm/pull/10953#discussion_r847322558


##########
src/runtime/vulkan/vulkan_stream.h:
##########
@@ -110,6 +125,7 @@ class VulkanStream {
   std::unordered_map<VkDescriptorSet, std::vector<VulkanStreamToken>> deferred_tokens_;
   std::vector<std::function<void(VulkanStreamState*)>> deferred_kernels_;
   VkCommandPool cmd_pool_;
+  VulkanStreamProfiler* profiler_;

Review Comment:
   > I think this can live on the stack?
   
   The profiler_ is also used from the vulkan_device_api.cc to set/reset the flags used for inserting debug labels into the vulkan stream. Since the profiler_ is used in other files, moving profiler_ into the stack may not work. Please correct if my understanding is wrong



##########
src/runtime/vulkan/vulkan_wrapped_func.cc:
##########
@@ -164,6 +164,15 @@ void VulkanWrappedFunc::operator()(TVMArgs args, TVMRetValue* rv,
     deferred_token.buffers_[i] = descriptor_buffers[i].buffer;
   }
   device.ThreadLocalStream().LaunchDeferred(deferred_initializer, deferred_kernel, deferred_token);
+
+  if (device.UseDebugUtilsLabel()) {
+    VkDebugUtilsLabelEXT dispatch_label = {VK_STRUCTURE_TYPE_DEBUG_UTILS_LABEL_EXT,
+                                           NULL,
+                                           func_name_.c_str(),
+                                           {0.0f, 0.0f, 0.0f, 0.0f}};
+    device.queue_insert_debug_utils_label_functions->vkQueueInsertDebugUtilsLabelEXT(
+        device.Queue(), &dispatch_label);
+  }

Review Comment:
   > Can you add this to the immediate-mode path as well?
   > 
   > https://github.com/apache/tvm/blob/f18d8d40bab7218a79392b56cf3d7cbcf6917e47/src/runtime/vulkan/vulkan_wrapped_func.cc#L73
   > 
   > RADV supports push_descriptor so when we run on RADV, we are using that path. And RADV apparently has some support for RGP.
   
   Thanks for the suggestion. I will make the changes and send them for review



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [tvm] masahi commented on a diff in pull request #10953: [Runtime][Vulkan] Add RGP support to TVM for vulkan device

Posted by GitBox <gi...@apache.org>.

masahi commented on code in PR #10953:
URL: https://github.com/apache/tvm/pull/10953#discussion_r847720123


##########
src/runtime/vulkan/vulkan_stream.h:
##########
@@ -110,6 +125,7 @@ class VulkanStream {
   std::unordered_map<VkDescriptorSet, std::vector<VulkanStreamToken>> deferred_tokens_;
   std::vector<std::function<void(VulkanStreamState*)>> deferred_kernels_;
   VkCommandPool cmd_pool_;
+  VulkanStreamProfiler* profiler_;

Review Comment:
   I mean, can we just use
   
   `VulkanStreamProfiler profiler_`
   
   and delete new / delete stuff



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [tvm] masahi commented on a diff in pull request #10953: [Runtime][Vulkan] Add RGP support to TVM for vulkan device

Posted by GitBox <gi...@apache.org>.

masahi commented on code in PR #10953:
URL: https://github.com/apache/tvm/pull/10953#discussion_r847720123


##########
src/runtime/vulkan/vulkan_stream.h:
##########
@@ -110,6 +125,7 @@ class VulkanStream {
   std::unordered_map<VkDescriptorSet, std::vector<VulkanStreamToken>> deferred_tokens_;
   std::vector<std::function<void(VulkanStreamState*)>> deferred_kernels_;
   VkCommandPool cmd_pool_;
+  VulkanStreamProfiler* profiler_;

Review Comment:
   I mean, can we just use
   
   `VulkanStreamProfiler profiler_`
   
   and remove `new / delete` stuff



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [tvm] mei-ye commented on pull request #10953: [Runtime][Vulkan] Add RGP support to TVM for vulkan device

Posted by GitBox <gi...@apache.org>.

mei-ye commented on PR #10953:
URL: https://github.com/apache/tvm/pull/10953#issuecomment-1097562967

   To successfully capture a trace, it requires at least five complete Present events.   Since the inference time is very short (in 10s of ms),  a loop with many iterations is normally required to ensure that the capture is completed before the process is terminated. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [tvm] avulisha commented on pull request #10953: [Runtime][Vulkan] Add RGP support to TVM for vulkan device

Posted by GitBox <gi...@apache.org>.

avulisha commented on PR #10953:
URL: https://github.com/apache/tvm/pull/10953#issuecomment-1097904668

   > Thanks! @avulisha
   Hi @masahi.
   Thanks for your time in reviewing the changes and merging them.
   Best Regards,
   Anurag


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org