You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2022/09/07 21:36:59 UTC

[GitHub] [tvm] janetsc opened a new pull request, #12727: WIP [Hexagon] [runtime] First pass at improving runtime resource management

janetsc opened a new pull request, #12727:
URL: https://github.com/apache/tvm/pull/12727

   Improves runtime buffer management
   
   - Adds entry points to "acquire" and "release" resources in the device API object.
   - Ensures that the buffer manager is empty on acquisition.
   - Releases any remaining buffers on release.
   
   Future changes:
   
   - Unit test
   - Add thread manager


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] janetsc commented on a diff in pull request #12727: [Hexagon] [runtime] Improve runtime resource management

Posted by GitBox <gi...@apache.org>.
janetsc commented on code in PR #12727:
URL: https://github.com/apache/tvm/pull/12727#discussion_r972508664


##########
src/runtime/hexagon/hexagon_device_api.h:
##########
@@ -50,6 +50,25 @@ class HexagonDeviceAPI final : public DeviceAPI {
   //! \brief Destructor
   ~HexagonDeviceAPI() {}
 
+  //! \brief Ensures resource managers are in a good state for the runtime
+  void AcquireResources() {
+    if (runtime_hexbuffs) {
+      LOG(INFO) << "runtime_hexbuffs has already been created";

Review Comment:
   I considered this, but I didn't want it to register as an error, as we don't enforce that all allocations are explicitly cleaned up.   (Several tests hit this.)
   
   Edit - sorry, I thought this was a comment on release.  I'd like to keep this as a message to avoid breaking any existing workflows that may have more than one session.  We can discuss more offline.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] cconvey commented on a diff in pull request #12727: [Hexagon] [runtime] Improve runtime resource management

Posted by GitBox <gi...@apache.org>.
cconvey commented on code in PR #12727:
URL: https://github.com/apache/tvm/pull/12727#discussion_r967850860


##########
src/runtime/hexagon/hexagon_device_api.h:
##########
@@ -45,11 +45,27 @@ class HexagonDeviceAPI final : public DeviceAPI {
   static HexagonDeviceAPI* Global();
 
   //! \brief Constructor
-  HexagonDeviceAPI() {}
+  HexagonDeviceAPI() { hexbuffs = std::make_unique<HexagonBufferManager>(); }
 
   //! \brief Destructor
   ~HexagonDeviceAPI() {}
 
+  //! \brief Creates resource managers for the runtime
+  void AcquireResources() {
+    if (!hexbuffs->empty()) {
+      LOG(INFO) << "hexbuffs was not empty in AcquireResources";

Review Comment:
   _Disclaimer: I've only looked briefly at the PR, so this question might be off-base._
   
   Depending on the intended lifespan of these allocations, would it be possible to move the allocation / deallocation logic into code that runs on the SoC?  E.g., into `tvm_rpc_android`?
   
   I could imagine a few benefits to this:
   - Depending on which resources we're talking about, some might be automatically freed by the SoC/Hexagon when the process is terminated.  So even if the host never has an opportunity to perform cleanup work, the Hexagon resources still are released.
   
     I'm thinking in particular about cases where the Hexagon code crashes or returns an error.  `tvm_rpc_server` either knows about the error, or (maybe?) `tvm_rpc_server` crashes as well.
   
   - Depending on the particular lifespans and automated-cleanup logics involved, we might never need to explicitly release certain resources.  E.g., the way Linux processes don't need to explicitly close their open file descriptors before terminating.
   
   (If this approach does seem appealing, we might want to fix the issue where sometimes `tvm_rpc_android` processes become zombies, since that might delay automated resource deallocations tied to that process.)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] JosephTheOctonaut commented on a diff in pull request #12727: [Hexagon] [runtime] Improve runtime resource management

Posted by GitBox <gi...@apache.org>.
JosephTheOctonaut commented on code in PR #12727:
URL: https://github.com/apache/tvm/pull/12727#discussion_r966434334


##########
src/runtime/hexagon/hexagon_device_api.h:
##########
@@ -45,11 +45,27 @@ class HexagonDeviceAPI final : public DeviceAPI {
   static HexagonDeviceAPI* Global();
 
   //! \brief Constructor
-  HexagonDeviceAPI() {}
+  HexagonDeviceAPI() { hexbuffs = std::make_unique<HexagonBufferManager>(); }
 
   //! \brief Destructor
   ~HexagonDeviceAPI() {}
 
+  //! \brief Creates resource managers for the runtime
+  void AcquireResources() {
+    if (!hexbuffs->empty()) {
+      LOG(INFO) << "hexbuffs was not empty in AcquireResources";
+    }
+  }
+
+  //! \brief Ensures we have freed all resources when we end the runtime
+  void ReleaseResources() {
+    if (!hexbuffs->empty()) {
+      LOG(INFO) << "hexbuffs was not empty in ReleaseResources, resetting";
+      hexbuffs.reset();
+      hexbuffs = std::make_unique<HexagonBufferManager>();

Review Comment:
   Ahh, I didn't realize it needed an explicit constructor passed in. 
   
   Either style seems quite clear to me!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] janetsc commented on a diff in pull request #12727: [Hexagon] [runtime] Improve runtime resource management

Posted by GitBox <gi...@apache.org>.
janetsc commented on code in PR #12727:
URL: https://github.com/apache/tvm/pull/12727#discussion_r972508664


##########
src/runtime/hexagon/hexagon_device_api.h:
##########
@@ -50,6 +50,25 @@ class HexagonDeviceAPI final : public DeviceAPI {
   //! \brief Destructor
   ~HexagonDeviceAPI() {}
 
+  //! \brief Ensures resource managers are in a good state for the runtime
+  void AcquireResources() {
+    if (runtime_hexbuffs) {
+      LOG(INFO) << "runtime_hexbuffs has already been created";

Review Comment:
   I considered this, but I didn't want it to register as an error, as we don't enforce that all allocations are explicitly cleaned up.   (Several tests hit this.)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] adstraw commented on a diff in pull request #12727: [Hexagon] [runtime] Improve runtime resource management

Posted by GitBox <gi...@apache.org>.
adstraw commented on code in PR #12727:
URL: https://github.com/apache/tvm/pull/12727#discussion_r968868937


##########
src/runtime/hexagon/hexagon_device_api.h:
##########
@@ -45,11 +45,27 @@ class HexagonDeviceAPI final : public DeviceAPI {
   static HexagonDeviceAPI* Global();
 
   //! \brief Constructor
-  HexagonDeviceAPI() {}
+  HexagonDeviceAPI() { hexbuffs = std::make_unique<HexagonBufferManager>(); }
 
   //! \brief Destructor
   ~HexagonDeviceAPI() {}
 
+  //! \brief Creates resource managers for the runtime
+  void AcquireResources() {
+    if (!hexbuffs->empty()) {
+      LOG(INFO) << "hexbuffs was not empty in AcquireResources";

Review Comment:
   Would it make more sense to put `hexbuffs = std::make_unique<HexagonBufferManager>();` in this function (`AcquireResources`) rather than in the constructor for the device API.  This way, we know that `hexbuffs` is empty at the time we run "acquire" - no need to check.  It would also alleviate the need to check whether `hexbuffs` is empty on "release" and we could simply `hexbuffs.reset()`.
   
   ```
   void AcquireResources() {
     CHECK_EQ(hexbuffs, nullptr);
     hexbuffs = std::make_unique<HexagonBufferManager>();
   }
   
   void ReleaseResources() {
     hexbuffs.reset();
     hexbuffs = nullptr;
   }
   ```
   
   The device api ctor could just set `hexbuffs = nullptr`.  We would need some logic to make sure that hexbuffs was not null on use or otherwise make sure that the use has run "acquire" to make this work.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] janetsc commented on a diff in pull request #12727: [Hexagon] [runtime] Improve runtime resource management

Posted by GitBox <gi...@apache.org>.
janetsc commented on code in PR #12727:
URL: https://github.com/apache/tvm/pull/12727#discussion_r966430790


##########
src/runtime/hexagon/hexagon_device_api.h:
##########
@@ -45,11 +45,27 @@ class HexagonDeviceAPI final : public DeviceAPI {
   static HexagonDeviceAPI* Global();
 
   //! \brief Constructor
-  HexagonDeviceAPI() {}
+  HexagonDeviceAPI() { hexbuffs = std::make_unique<HexagonBufferManager>(); }
 
   //! \brief Destructor
   ~HexagonDeviceAPI() {}
 
+  //! \brief Creates resource managers for the runtime
+  void AcquireResources() {
+    if (!hexbuffs->empty()) {
+      LOG(INFO) << "hexbuffs was not empty in AcquireResources";

Review Comment:
   During my testing, I saw that there were allocations and frees before AcquireResources was called.  (During RPC initialization of this server). Sometimes due to timing, there were still allocations at this point.
   
   It is not the latter.  I agree that we need a scheme for each user/client to have its own resource manager.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] JosephTheOctonaut commented on a diff in pull request #12727: [Hexagon] [runtime] Improve runtime resource management

Posted by GitBox <gi...@apache.org>.
JosephTheOctonaut commented on code in PR #12727:
URL: https://github.com/apache/tvm/pull/12727#discussion_r966436276


##########
src/runtime/hexagon/hexagon_device_api.h:
##########
@@ -45,11 +45,27 @@ class HexagonDeviceAPI final : public DeviceAPI {
   static HexagonDeviceAPI* Global();
 
   //! \brief Constructor
-  HexagonDeviceAPI() {}
+  HexagonDeviceAPI() { hexbuffs = std::make_unique<HexagonBufferManager>(); }
 
   //! \brief Destructor
   ~HexagonDeviceAPI() {}
 
+  //! \brief Creates resource managers for the runtime
+  void AcquireResources() {
+    if (!hexbuffs->empty()) {
+      LOG(INFO) << "hexbuffs was not empty in AcquireResources";

Review Comment:
   I see --- so it's just that the resource manager is spinning up before the RPC initialization is totally done. So those are transient allocations that will (definitely?) be freed soon?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] tmoreau89 commented on pull request #12727: [Hexagon] [runtime] Improve runtime resource management

Posted by GitBox <gi...@apache.org>.
tmoreau89 commented on PR #12727:
URL: https://github.com/apache/tvm/pull/12727#issuecomment-1248124053

   CC @kparzysz-quic for additional input; thanks!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] janetsc commented on a diff in pull request #12727: [Hexagon] [runtime] Improve runtime resource management

Posted by GitBox <gi...@apache.org>.
janetsc commented on code in PR #12727:
URL: https://github.com/apache/tvm/pull/12727#discussion_r969898294


##########
src/runtime/hexagon/hexagon_device_api.h:
##########
@@ -45,11 +45,27 @@ class HexagonDeviceAPI final : public DeviceAPI {
   static HexagonDeviceAPI* Global();
 
   //! \brief Constructor
-  HexagonDeviceAPI() {}
+  HexagonDeviceAPI() { hexbuffs = std::make_unique<HexagonBufferManager>(); }
 
   //! \brief Destructor
   ~HexagonDeviceAPI() {}
 
+  //! \brief Creates resource managers for the runtime
+  void AcquireResources() {
+    if (!hexbuffs->empty()) {
+      LOG(INFO) << "hexbuffs was not empty in AcquireResources";

Review Comment:
   I have some pending updates that will keep a static object for allocations before and (potentially) after the runtime, and create/destroy the runtime manager in acquire/release.  Although not ideal, I think that's better.  The lifetime of runtime management is more clear.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] janetsc commented on a diff in pull request #12727: [Hexagon] [runtime] Improve runtime resource management

Posted by GitBox <gi...@apache.org>.
janetsc commented on code in PR #12727:
URL: https://github.com/apache/tvm/pull/12727#discussion_r973138646


##########
python/tvm/contrib/hexagon/session.py:
##########
@@ -88,14 +88,23 @@ def __enter__(self):
                     self._rpc_receive_buffer_size_bytes,
                 ],
             )
+            func = self._rpc.get_function("device_api.hexagon.acquire_resources")
+            func()
             return self
 
         except RuntimeError as exception:
             raise exception
 
     def __exit__(self, exc_type, exc_value, exc_traceback):
-        # close session to the tracker
-        del self._rpc
+        try:
+            func = self._rpc.get_function("device_api.hexagon.release_resources")
+            func()
+        except RuntimeError:
+            # do nothing, as we are shutting down

Review Comment:
   We don't have access to _LOG here, but I added a print. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] areusch merged pull request #12727: [Hexagon] [runtime] Improve runtime resource management

Posted by GitBox <gi...@apache.org>.
areusch merged PR #12727:
URL: https://github.com/apache/tvm/pull/12727


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] janetsc closed pull request #12727: [Hexagon] [runtime] Improve runtime resource management

Posted by GitBox <gi...@apache.org>.
janetsc closed pull request #12727: [Hexagon] [runtime] Improve runtime resource management
URL: https://github.com/apache/tvm/pull/12727


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] Lunderberg commented on a diff in pull request #12727: [Hexagon] [runtime] Improve runtime resource management

Posted by GitBox <gi...@apache.org>.
Lunderberg commented on code in PR #12727:
URL: https://github.com/apache/tvm/pull/12727#discussion_r966186207


##########
python/tvm/contrib/hexagon/build.py:
##########
@@ -149,6 +149,18 @@ def stop_server(self):
         """Stop the RPC server"""
         ...
 
+    def _acquire_device_resources(self):
+        with self.start_session() as session:
+            """Call into device to initialize objects to acquire resources"""

Review Comment:
   Linting help: If applicable to the entire function, this string should be moved one line higher, so that it is the docstring of the function.  If applicable only to this `with` scope, it should be a comment instead of a string literal.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] janetsc commented on a diff in pull request #12727: [Hexagon] [runtime] Improve runtime resource management

Posted by GitBox <gi...@apache.org>.
janetsc commented on code in PR #12727:
URL: https://github.com/apache/tvm/pull/12727#discussion_r972508028


##########
tests/cpp-runtime/hexagon/hexagon_device_api_tests.cc:
##########
@@ -146,3 +146,21 @@ TEST_F(HexagonDeviceAPITest, DISABLED_alloc_free_diff_dev) {
   CHECK(buf != nullptr);
   EXPECT_THROW(hexapi->FreeDataSpace(cpu_dev, buf), InternalError);
 }
+
+// Alloc a non-runtime buffer
+// Alloc a runtime buffer
+// "Release" resources for runtime
+// Verify the runtime buffer cannot be freed, but the non-runtime buffer can
+// This test should be run last
+TEST_F(HexagonDeviceAPITest, z_leak_resources) {

Review Comment:
   I can remove - this was part of testing



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] cconvey commented on a diff in pull request #12727: [Hexagon] [runtime] Improve runtime resource management

Posted by GitBox <gi...@apache.org>.
cconvey commented on code in PR #12727:
URL: https://github.com/apache/tvm/pull/12727#discussion_r967850860


##########
src/runtime/hexagon/hexagon_device_api.h:
##########
@@ -45,11 +45,27 @@ class HexagonDeviceAPI final : public DeviceAPI {
   static HexagonDeviceAPI* Global();
 
   //! \brief Constructor
-  HexagonDeviceAPI() {}
+  HexagonDeviceAPI() { hexbuffs = std::make_unique<HexagonBufferManager>(); }
 
   //! \brief Destructor
   ~HexagonDeviceAPI() {}
 
+  //! \brief Creates resource managers for the runtime
+  void AcquireResources() {
+    if (!hexbuffs->empty()) {
+      LOG(INFO) << "hexbuffs was not empty in AcquireResources";

Review Comment:
   _Disclaimer: I've only looked briefly at the PR, so this question might be off-base._
   
   Depending on the intended lifespan of these allocations, would it be possible to move the allocation / deallocation logic into code that runs on the SoC?  E.g., into `tvm_rpc_android`?
   
   I could imagine a few benefits to this:
   - Depending on which resources we're talking about, some might be automatically freed by the SoC/Hexagon when the process is terminated.  So even if the host never has an opportunity to perform cleanup work, the Hexagon resources still are released.
   
     I'm thinking in particular about cases where the Hexagon code crashes or returns an error.  `tvm_rpc_server` either knows about the error, or (maybe?) `tvm_rpc_server` crashes as well.
   
   - Depending on the particular lifespans and automated-cleanup logics involved, we might never need to explicitly release certain resources.  E.g., the way Linux processes don't need to explicitly close their open file descriptors before terminating.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] janetsc commented on a diff in pull request #12727: [Hexagon] [runtime] Improve runtime resource management

Posted by GitBox <gi...@apache.org>.
janetsc commented on code in PR #12727:
URL: https://github.com/apache/tvm/pull/12727#discussion_r972509001


##########
src/runtime/hexagon/hexagon_device_api.h:
##########
@@ -138,7 +157,11 @@ class HexagonDeviceAPI final : public DeviceAPI {
   }
 
   //! \brief Manages underlying HexagonBuffer allocations
+  // runtime_hexbuffs is used for runtime allocations.  It is created
+  // with a call to AcquireResources, and destroyed on ReleaseResources.

Review Comment:
   This is so we can ensure that resources are cleaned up for a runtime session, when we are doing tuning in a loop.  We want the client side to explicitly free resources.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] janetsc commented on a diff in pull request #12727: [Hexagon] [runtime] Improve runtime resource management

Posted by GitBox <gi...@apache.org>.
janetsc commented on code in PR #12727:
URL: https://github.com/apache/tvm/pull/12727#discussion_r973227020


##########
src/runtime/hexagon/hexagon_device_api.h:
##########
@@ -50,6 +50,25 @@ class HexagonDeviceAPI final : public DeviceAPI {
   //! \brief Destructor
   ~HexagonDeviceAPI() {}
 
+  //! \brief Ensures resource managers are in a good state for the runtime
+  void AcquireResources() {
+    if (runtime_hexbuffs) {
+      LOG(INFO) << "runtime_hexbuffs has already been created";

Review Comment:
   This has been fixed - thanks for catching this.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] janetsc commented on a diff in pull request #12727: [Hexagon] [runtime] Improve runtime resource management

Posted by GitBox <gi...@apache.org>.
janetsc commented on code in PR #12727:
URL: https://github.com/apache/tvm/pull/12727#discussion_r973368886


##########
src/runtime/hexagon/hexagon_device_api.cc:
##########
@@ -90,18 +90,20 @@ void* HexagonDeviceAPI::AllocDataSpace(Device dev, int ndim, const int64_t* shap
 
   const size_t typesize = (dtype.bits / 8) * dtype.lanes;
 
+  HexagonBufferManager* mgr = runtime_hexbuffs ? runtime_hexbuffs.get() : &hexbuffs;

Review Comment:
   I was keeping the check for both managers in free.  However, now that I'm calling Acquire/Release in the session object directly, I can switch to having a "current" manager, and remove those trinary statements.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] Lunderberg commented on a diff in pull request #12727: [Hexagon] [runtime] Improve runtime resource management

Posted by GitBox <gi...@apache.org>.
Lunderberg commented on code in PR #12727:
URL: https://github.com/apache/tvm/pull/12727#discussion_r971025248


##########
python/tvm/contrib/hexagon/session.py:
##########
@@ -88,14 +88,23 @@ def __enter__(self):
                     self._rpc_receive_buffer_size_bytes,
                 ],
             )
+            func = self._rpc.get_function("device_api.hexagon.acquire_resources")
+            func()
             return self
 
         except RuntimeError as exception:
             raise exception
 
     def __exit__(self, exc_type, exc_value, exc_traceback):
-        # close session to the tracker
-        del self._rpc
+        try:
+            func = self._rpc.get_function("device_api.hexagon.release_resources")
+            func()
+        except RuntimeError as exception:
+            # do nothing, as we are shutting down

Review Comment:
   Redacted Nitpick: I was going to make a nitpick about `...` and `pass`, but it looks like it's [a more open discussion](https://stackoverflow.com/questions/55274977/when-is-the-usage-of-the-python-ellipsis-to-be-preferred-over-pass/67769536) than I had thought.  I generally see `...` for function stubs and `pass` for explicitly doing nothing in a block, but my certainty has been shaken.



##########
python/tvm/contrib/hexagon/session.py:
##########
@@ -88,14 +88,23 @@ def __enter__(self):
                     self._rpc_receive_buffer_size_bytes,
                 ],
             )
+            func = self._rpc.get_function("device_api.hexagon.acquire_resources")
+            func()
             return self
 
         except RuntimeError as exception:
             raise exception
 
     def __exit__(self, exc_type, exc_value, exc_traceback):
-        # close session to the tracker
-        del self._rpc
+        try:
+            func = self._rpc.get_function("device_api.hexagon.release_resources")
+            func()
+        except RuntimeError as exception:

Review Comment:
   Looks like this is the line giving a lint error.  Where value of the exception isn't required, the `as` statement can be removed, so this line could be just `except RuntimeError`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] JosephTheOctonaut commented on a diff in pull request #12727: [Hexagon] [runtime] Improve runtime resource management

Posted by GitBox <gi...@apache.org>.
JosephTheOctonaut commented on code in PR #12727:
URL: https://github.com/apache/tvm/pull/12727#discussion_r966412900


##########
src/runtime/hexagon/hexagon_device_api.h:
##########
@@ -45,11 +45,27 @@ class HexagonDeviceAPI final : public DeviceAPI {
   static HexagonDeviceAPI* Global();
 
   //! \brief Constructor
-  HexagonDeviceAPI() {}
+  HexagonDeviceAPI() { hexbuffs = std::make_unique<HexagonBufferManager>(); }
 
   //! \brief Destructor
   ~HexagonDeviceAPI() {}
 
+  //! \brief Creates resource managers for the runtime
+  void AcquireResources() {
+    if (!hexbuffs->empty()) {
+      LOG(INFO) << "hexbuffs was not empty in AcquireResources";
+    }
+  }
+
+  //! \brief Ensures we have freed all resources when we end the runtime
+  void ReleaseResources() {
+    if (!hexbuffs->empty()) {
+      LOG(INFO) << "hexbuffs was not empty in ReleaseResources, resetting";
+      hexbuffs.reset();
+      hexbuffs = std::make_unique<HexagonBufferManager>();

Review Comment:
   Doesn't `reset()` already replace the managed pointer with a new one? So the new declaration is redundant?



##########
src/runtime/hexagon/hexagon_device_api.h:
##########
@@ -45,11 +45,27 @@ class HexagonDeviceAPI final : public DeviceAPI {
   static HexagonDeviceAPI* Global();
 
   //! \brief Constructor
-  HexagonDeviceAPI() {}
+  HexagonDeviceAPI() { hexbuffs = std::make_unique<HexagonBufferManager>(); }
 
   //! \brief Destructor
   ~HexagonDeviceAPI() {}
 
+  //! \brief Creates resource managers for the runtime
+  void AcquireResources() {
+    if (!hexbuffs->empty()) {
+      LOG(INFO) << "hexbuffs was not empty in AcquireResources";

Review Comment:
   Is the expected situation here that a previous run crashed or exited without releasing the buffers? Or we're allowing for another user of the device to exist in parallel?
   
   If it's the former, could we just reset the buffer manager here? If it's the latter, it seems wrong to reset all buffers in `ReleaseResources()` (because some might belong to another running application).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] mehrdadh commented on a diff in pull request #12727: [Hexagon] [runtime] Improve runtime resource management

Posted by GitBox <gi...@apache.org>.
mehrdadh commented on code in PR #12727:
URL: https://github.com/apache/tvm/pull/12727#discussion_r966186024


##########
python/tvm/contrib/hexagon/build.py:
##########
@@ -149,6 +149,18 @@ def stop_server(self):
         """Stop the RPC server"""
         ...
 
+    def _acquire_device_resources(self):
+        with self.start_session() as session:
+            """Call into device to initialize objects to acquire resources"""

Review Comment:
   maybe move it to the beginning of the function?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] janetsc commented on a diff in pull request #12727: [Hexagon] [runtime] Improve runtime resource management

Posted by GitBox <gi...@apache.org>.
janetsc commented on code in PR #12727:
URL: https://github.com/apache/tvm/pull/12727#discussion_r966431706


##########
src/runtime/hexagon/hexagon_device_api.h:
##########
@@ -45,11 +45,27 @@ class HexagonDeviceAPI final : public DeviceAPI {
   static HexagonDeviceAPI* Global();
 
   //! \brief Constructor
-  HexagonDeviceAPI() {}
+  HexagonDeviceAPI() { hexbuffs = std::make_unique<HexagonBufferManager>(); }
 
   //! \brief Destructor
   ~HexagonDeviceAPI() {}
 
+  //! \brief Creates resource managers for the runtime
+  void AcquireResources() {
+    if (!hexbuffs->empty()) {
+      LOG(INFO) << "hexbuffs was not empty in AcquireResources";
+    }
+  }
+
+  //! \brief Ensures we have freed all resources when we end the runtime
+  void ReleaseResources() {
+    if (!hexbuffs->empty()) {
+      LOG(INFO) << "hexbuffs was not empty in ReleaseResources, resetting";
+      hexbuffs.reset();
+      hexbuffs = std::make_unique<HexagonBufferManager>();

Review Comment:
   reset() destroys the original object, and doesn't create a new one.
   
   I could change it to reset(new HexagonBufferManager)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] adstraw commented on a diff in pull request #12727: [Hexagon] [runtime] Improve runtime resource management

Posted by GitBox <gi...@apache.org>.
adstraw commented on code in PR #12727:
URL: https://github.com/apache/tvm/pull/12727#discussion_r973333270


##########
src/runtime/hexagon/hexagon_device_api.cc:
##########
@@ -90,18 +90,20 @@ void* HexagonDeviceAPI::AllocDataSpace(Device dev, int ndim, const int64_t* shap
 
   const size_t typesize = (dtype.bits / 8) * dtype.lanes;
 
+  HexagonBufferManager* mgr = runtime_hexbuffs ? runtime_hexbuffs.get() : &hexbuffs;

Review Comment:
   It would be best if we could fully find and fix the "init time" uses of `hexbuffs` before merging this PR.
   
   If that's not possible, then `mgr` should be a member of HexagonDeviceAPI which is initialized with `&hexbuffs` and then set to `std::make_unique<HexagonBufferManager>()` in "acquire" to avoid all this trinary switching at runtime.



##########
src/runtime/hexagon/hexagon_device_api.h:
##########
@@ -50,6 +50,22 @@ class HexagonDeviceAPI final : public DeviceAPI {
   //! \brief Destructor
   ~HexagonDeviceAPI() {}
 
+  //! \brief Ensures resource managers are in a good state for the runtime
+  void AcquireResources() {
+    CHECK_EQ(runtime_hexbuffs, nullptr);
+    runtime_hexbuffs = std::make_unique<HexagonBufferManager>();

Review Comment:
   Set `mgr` member variable here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] mehrdadh commented on a diff in pull request #12727: [Hexagon] [runtime] Improve runtime resource management

Posted by GitBox <gi...@apache.org>.
mehrdadh commented on code in PR #12727:
URL: https://github.com/apache/tvm/pull/12727#discussion_r966187693


##########
python/tvm/contrib/hexagon/build.py:
##########
@@ -149,6 +149,18 @@ def stop_server(self):
         """Stop the RPC server"""
         ...
 
+    def _acquire_device_resources(self):
+        with self.start_session() as session:
+            """Call into device to initialize objects to acquire resources"""
+            func = session._rpc.get_function("device_api.hexagon.acquire_resources")
+            func()
+
+    def _release_device_resources(self):
+        with self.start_session() as session:
+            """Call into device to release resources"""

Review Comment:
   same here



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] janetsc commented on a diff in pull request #12727: [Hexagon] [runtime] Improve runtime resource management

Posted by GitBox <gi...@apache.org>.
janetsc commented on code in PR #12727:
URL: https://github.com/apache/tvm/pull/12727#discussion_r966454458


##########
src/runtime/hexagon/hexagon_device_api.h:
##########
@@ -45,11 +45,27 @@ class HexagonDeviceAPI final : public DeviceAPI {
   static HexagonDeviceAPI* Global();
 
   //! \brief Constructor
-  HexagonDeviceAPI() {}
+  HexagonDeviceAPI() { hexbuffs = std::make_unique<HexagonBufferManager>(); }
 
   //! \brief Destructor
   ~HexagonDeviceAPI() {}
 
+  //! \brief Creates resource managers for the runtime
+  void AcquireResources() {
+    if (!hexbuffs->empty()) {
+      LOG(INFO) << "hexbuffs was not empty in AcquireResources";

Review Comment:
   You raise a good point - I'm going to investigate this more before proceeding.
   
   I saw different behavior in the simulator vs real devices.  (Timing is different?)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] janetsc commented on a diff in pull request #12727: [Hexagon] [runtime] Improve runtime resource management

Posted by GitBox <gi...@apache.org>.
janetsc commented on code in PR #12727:
URL: https://github.com/apache/tvm/pull/12727#discussion_r971105911


##########
python/tvm/contrib/hexagon/session.py:
##########
@@ -88,14 +88,23 @@ def __enter__(self):
                     self._rpc_receive_buffer_size_bytes,
                 ],
             )
+            func = self._rpc.get_function("device_api.hexagon.acquire_resources")
+            func()
             return self
 
         except RuntimeError as exception:
             raise exception
 
     def __exit__(self, exc_type, exc_value, exc_traceback):
-        # close session to the tracker
-        del self._rpc
+        try:
+            func = self._rpc.get_function("device_api.hexagon.release_resources")
+            func()
+        except RuntimeError as exception:

Review Comment:
   The lint error was regarding the variable "exception".  I fixed it with my latest commit.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] janetsc commented on a diff in pull request #12727: [Hexagon] [runtime] Improve runtime resource management

Posted by GitBox <gi...@apache.org>.
janetsc commented on code in PR #12727:
URL: https://github.com/apache/tvm/pull/12727#discussion_r968749921


##########
src/runtime/hexagon/hexagon_device_api.h:
##########
@@ -45,11 +45,27 @@ class HexagonDeviceAPI final : public DeviceAPI {
   static HexagonDeviceAPI* Global();
 
   //! \brief Constructor
-  HexagonDeviceAPI() {}
+  HexagonDeviceAPI() { hexbuffs = std::make_unique<HexagonBufferManager>(); }
 
   //! \brief Destructor
   ~HexagonDeviceAPI() {}
 
+  //! \brief Creates resource managers for the runtime
+  void AcquireResources() {
+    if (!hexbuffs->empty()) {
+      LOG(INFO) << "hexbuffs was not empty in AcquireResources";

Review Comment:
   I verified that all items that were allocated before the call to "acquire" are indeed released by the time we release.
   
   We do create the resources on the device side now, statically.  This is attempting to have the resource managers only created for the session, and then explicitly destroyed before the process is terminated.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] adstraw commented on a diff in pull request #12727: [Hexagon] [runtime] Improve runtime resource management

Posted by GitBox <gi...@apache.org>.
adstraw commented on code in PR #12727:
URL: https://github.com/apache/tvm/pull/12727#discussion_r968868937


##########
src/runtime/hexagon/hexagon_device_api.h:
##########
@@ -45,11 +45,27 @@ class HexagonDeviceAPI final : public DeviceAPI {
   static HexagonDeviceAPI* Global();
 
   //! \brief Constructor
-  HexagonDeviceAPI() {}
+  HexagonDeviceAPI() { hexbuffs = std::make_unique<HexagonBufferManager>(); }
 
   //! \brief Destructor
   ~HexagonDeviceAPI() {}
 
+  //! \brief Creates resource managers for the runtime
+  void AcquireResources() {
+    if (!hexbuffs->empty()) {
+      LOG(INFO) << "hexbuffs was not empty in AcquireResources";

Review Comment:
   Would it make more sense to put `hexbuffs = std::make_unique<HexagonBufferManager>();` in this function (`AcquireResources`) rather than in the constructor for the device API.  This way, we know that `hexbuffs` is empty at the time we run "acquire" - no need to check.  It would also alleviate the need to check whether `hexbuffs` is empty on "release" and we could simply `hexbuffs.reset()`.
   
   ```
   void AcquireResources() {
     CHECK_EQ(hexbuffs, nullptr);
     hexbuffs = std::make_unique<HexagonBufferManager>();
   }
   
   void ReleaseResources() {
     hexbuffs.reset();
     hexbuffs = nullptr;
   }
   ```
   
   The device api ctor could just set hexbuffs=nullptr.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] areusch commented on a diff in pull request #12727: [Hexagon] [runtime] Improve runtime resource management

Posted by GitBox <gi...@apache.org>.
areusch commented on code in PR #12727:
URL: https://github.com/apache/tvm/pull/12727#discussion_r972473903


##########
tests/cpp-runtime/hexagon/hexagon_device_api_tests.cc:
##########
@@ -146,3 +146,21 @@ TEST_F(HexagonDeviceAPITest, DISABLED_alloc_free_diff_dev) {
   CHECK(buf != nullptr);
   EXPECT_THROW(hexapi->FreeDataSpace(cpu_dev, buf), InternalError);
 }
+
+// Alloc a non-runtime buffer
+// Alloc a runtime buffer
+// "Release" resources for runtime
+// Verify the runtime buffer cannot be freed, but the non-runtime buffer can
+// This test should be run last
+TEST_F(HexagonDeviceAPITest, z_leak_resources) {

Review Comment:
   just curious, what's "z_" mean here?



##########
python/tvm/contrib/hexagon/session.py:
##########
@@ -88,14 +88,23 @@ def __enter__(self):
                     self._rpc_receive_buffer_size_bytes,
                 ],
             )
+            func = self._rpc.get_function("device_api.hexagon.acquire_resources")
+            func()
             return self
 
         except RuntimeError as exception:
             raise exception
 
     def __exit__(self, exc_type, exc_value, exc_traceback):
-        # close session to the tracker
-        del self._rpc
+        try:
+            func = self._rpc.get_function("device_api.hexagon.release_resources")
+            func()
+        except RuntimeError:
+            # do nothing, as we are shutting down

Review Comment:
   we could probably `_LOG.warn("Exception occurred while calling release_resources() during Session __exit__", exc_info=1)` here



##########
src/runtime/hexagon/hexagon_device_api.h:
##########
@@ -50,6 +50,25 @@ class HexagonDeviceAPI final : public DeviceAPI {
   //! \brief Destructor
   ~HexagonDeviceAPI() {}
 
+  //! \brief Ensures resource managers are in a good state for the runtime
+  void AcquireResources() {
+    if (runtime_hexbuffs) {
+      LOG(INFO) << "runtime_hexbuffs has already been created";

Review Comment:
   should this be an error, since it might be dirty now?



##########
src/runtime/hexagon/hexagon_device_api.h:
##########
@@ -138,7 +157,11 @@ class HexagonDeviceAPI final : public DeviceAPI {
   }
 
   //! \brief Manages underlying HexagonBuffer allocations
+  // runtime_hexbuffs is used for runtime allocations.  It is created
+  // with a call to AcquireResources, and destroyed on ReleaseResources.

Review Comment:
   could you add some explanation of why the extra, non-RAII behavior is taken here?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org