You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2021/05/28 00:04:37 UTC

[GitHub] [tvm] Lunderberg opened a new pull request #8157: [Vulkan][Refactor] Split out vulkan.cc into separate distinct functionality

Lunderberg opened a new pull request #8157:
URL: https://github.com/apache/tvm/pull/8157


   This is in preparation for additional refactoring.  Functions are organized according to group similar functionality together, to minimize the amount of file-to-file transfers needed later.  The main divisions are between VulkanDeviceAPI, VulkanModuleNode/VulkanWrappedFunc, VulkanThreadEntry, and VulkanContext.
   
   Other than minimal renaming of private functions and the addition of some documenting comments, this commit should have zero changes to the functions definitions themselves, only to their arrangement within the `src/runtime/vulkan` directory.
   
   
   This rearrangement is in preparation for the following planned changes.  (These changes are **not implemented** in the current PR, but are mentioned here as the motivation for why the functionality is grouped as it is.  Each is intended to be a separate PR following this one, with the file rearrangement done such that each later change will touch the minimum number of files.)
   
   * New class, VulkanInstance
   
     Currently, `VulkanDeviceAPI` constructs the VkInstance in the constructor, then destroys in the destructor.  If an exception is thrown after `vkCreateInstance`, before the end of the constructor, the `VulkanDeviceAPI` destructor and `vkDestroyInstace` is never called.  This can cause segfaults on NV drivers.
   
     `VulkanInstance` class to be constructed in `VulkanDeviceAPI`'s constructor.  Last line of `VulkanInstance` constructor calls `vkCreateInstance`, and the destructor calls `vkDestroyInstance`.
   
   * Rename, `VulkanContext` -> `VulkanDevice`
   
     In keeping with the general renaming of `tvm.context` to `tvm.device`, applying the same here.
     
   * VulkanDevice, constructor/destructor
     
     In constructor, initialize itself based on a `VkPhysicalDevice`.  In destructor, call `vkDestroyDevice`.  These are pulled out of the `VulkanDeviceAPI` constructor.
     
   * VulkanBuffer, constructor/destructor
   
     `CreateBuffer` function becomes a method of `VulkanDevice`, returns a buffer object.  Each `VulkanBuffer` object is owned by the `VulkanDevice` that created it.  The `VulkanBuffer` objects can be moved, but not copied, so a `VulkanBuffer*` can still be returned by `AllocDataSpace`.  The `VulkanBuffer` calls `vkDestroyBuffer` and `vkFreeMemory` in its destructor.
     
   * Remove maps from VulkanThreadEntry
   
     Some device-specific parameters are stored in `VulkanDevice`, while others are stored in `VulkanThreadEntry`.  This would move all device-specific parameters into a single location.  The `VulkanDevice` would hold a `ThreadLocalStore` with a `VulkanStream`, `VulkanStagingBuffer`, and `VulkanUnformBuffer` in order to maintain separate per-device parameters.
   
     The active device and the `WorkspacePool` would remain with the `VulkanThreadEntry`, since these are not device-specific.
   
   * Unify `Launch` and `LaunchDeferred`
   
     As far as I can tell, the reason to have `LaunchDeferred` is to prevent multiple kernels that have the same descriptor set from submitted to the GPU at the same time.  `Launch` is allowed to push to the command buffer without any checks.  This requires the calling scope to know many internals of the `VulkanStream`, such as knowing that the deferred_initializer should only ever set descriptor sets.
   
     Instead, having `VulkanStream::Launch` accept an argument of which descriptors are needed (the `std::vector<VkDescriptorBufferInfo>`). This pulls most of the UseImmediate-specific logic into a single place.
   
     Also, at this point, rename `UseImmediate` to `UsePushDescriptors`. `UseImmediate` suggests that the mode is similar to immediate mode of OpenGL or Direct2D, which is not the case.  Whether or not push descriptors are used, the command buffer isn't submitted to the GPU until the `vkQueueSubmit` call in `VulkanStream::Synchronize`.
     
   * Track uniform buffer bytes in the VulkanStream
   
     Along with the descriptor sets, the uniform buffer space is also shared between all kernels simultaneously submitted to the GPU and using the same the uniform buffer.  In the current implementation, if two consecutive functions each require UBO for scalar inputs, the second one will overwrite the values of the first.
     
     Instead, we should allocate the maximum uniform buffer range allowed on a GPU, then maintain a count of how many bytes are in use. Similar to how the current `LaunchDeferred` implementation handles conflicting descriptor sets, if a new kernel cannot be submitted without overflowing the range available, make a call to `Synchronize`.
     
   * Move `VulkanPipeline` cleanup to destructor
   
     Currently in `VulkanModuleNode` destructor, would be implemented in `VulkanPipeline` destructor instead.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tvm] Lunderberg commented on a change in pull request #8157: [Vulkan][Refactor] Split out vulkan.cc into separate distinct functionality

Posted by GitBox <gi...@apache.org>.
Lunderberg commented on a change in pull request #8157:
URL: https://github.com/apache/tvm/pull/8157#discussion_r641561663



##########
File path: src/runtime/vulkan/vulkan_stream.h
##########
@@ -43,135 +44,64 @@ struct VulkanStreamToken {
   std::vector<VkBuffer> buffers_;
 };
 
+/*!
+ *  \brief Wrapper around a vulkan command buffer
+ *
+ *  The VulkanStream collects commands into a VkCommandBuffer.  When a
+ *  newly submitted command requires resources reserved by an
+ *  already-submitted command, all of the queued commands are
+ *  submitted to the GPU, and the CPU waits for all queued commands to
+ *  finish.  The queued commands can also be explicitly pushed/waited
+ *  on by calling VulkanStream::Synchronize.
+ *
+ *  If the device is

Review comment:
       Thank you for that catch, and next time I should probably separate out the documentation from the refactoring to make those items be easier to find ahead of time.  I have no memory of what I was going to finish that paragraph with, so I have deleted it for now.
   
   Also, fixed up the linting errors.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tvm] masahi commented on pull request #8157: [Vulkan][Refactor] Split out vulkan.cc into separate distinct functionality

Posted by GitBox <gi...@apache.org>.
masahi commented on pull request #8157:
URL: https://github.com/apache/tvm/pull/8157#issuecomment-850739433


   thanks @Lunderberg @tmoreau89 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tvm] tmoreau89 commented on a change in pull request #8157: [Vulkan][Refactor] Split out vulkan.cc into separate distinct functionality

Posted by GitBox <gi...@apache.org>.
tmoreau89 commented on a change in pull request #8157:
URL: https://github.com/apache/tvm/pull/8157#discussion_r641274980



##########
File path: src/runtime/vulkan/vulkan_stream.h
##########
@@ -43,135 +44,64 @@ struct VulkanStreamToken {
   std::vector<VkBuffer> buffers_;
 };
 
+/*!
+ *  \brief Wrapper around a vulkan command buffer
+ *
+ *  The VulkanStream collects commands into a VkCommandBuffer.  When a
+ *  newly submitted command requires resources reserved by an
+ *  already-submitted command, all of the queued commands are
+ *  submitted to the GPU, and the CPU waits for all queued commands to
+ *  finish.  The queued commands can also be explicitly pushed/waited
+ *  on by calling VulkanStream::Synchronize.
+ *
+ *  If the device is

Review comment:
       Nice work with the additional comments here, but it seems this paragraph is incomplete!




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tvm] masahi merged pull request #8157: [Vulkan][Refactor] Split out vulkan.cc into separate distinct functionality

Posted by GitBox <gi...@apache.org>.
masahi merged pull request #8157:
URL: https://github.com/apache/tvm/pull/8157


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tvm] Lunderberg commented on pull request #8157: [Vulkan][Refactor] Split out vulkan.cc into separate distinct functionality

Posted by GitBox <gi...@apache.org>.
Lunderberg commented on pull request #8157:
URL: https://github.com/apache/tvm/pull/8157#issuecomment-850020388


   Potential reviewers: @masahi @tmoreau89 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org