You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2022/09/30 21:11:10 UTC

[GitHub] [tvm] kparzysz-quic opened a new pull request, #12957: [Hexagon] Remove TVM logging from performance-sensitive parts of runtime

kparzysz-quic opened a new pull request, #12957:
URL: https://github.com/apache/tvm/pull/12957

   Use HAP messaging instead. This greatly reduced execution overhead.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] janetsc commented on pull request #12957: [Hexagon] Remove TVM logging from performance-sensitive parts of runtime

Posted by GitBox <gi...@apache.org>.
janetsc commented on PR #12957:
URL: https://github.com/apache/tvm/pull/12957#issuecomment-1265677915

   > 
   
   Ah, I see some tests are throwing this even for runtime_hexbuffs (before ReleaseResources is called.). I'm going to continue to investigate.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] kparzysz-quic commented on pull request #12957: [Hexagon] Remove TVM logging from performance-sensitive parts of runtime

Posted by GitBox <gi...@apache.org>.
kparzysz-quic commented on PR #12957:
URL: https://github.com/apache/tvm/pull/12957#issuecomment-1265790665

   Closing this, will come up with a smaller fix.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] kparzysz-quic commented on pull request #12957: [Hexagon] Remove TVM logging from performance-sensitive parts of runtime

Posted by GitBox <gi...@apache.org>.
kparzysz-quic commented on PR #12957:
URL: https://github.com/apache/tvm/pull/12957#issuecomment-1265647363

   > I tested this locally, which I prefer to removing the assert in `FreeHexagonBuffer`. This will catch the error if we are operating on the static buffer manager:
   
   This still throws an exception.  The whole point of this is not to throw it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] kparzysz-quic closed pull request #12957: [Hexagon] Remove TVM logging from performance-sensitive parts of runtime

Posted by GitBox <gi...@apache.org>.
kparzysz-quic closed pull request #12957: [Hexagon] Remove TVM logging from performance-sensitive parts of runtime
URL: https://github.com/apache/tvm/pull/12957


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] janetsc commented on a diff in pull request #12957: [Hexagon] Remove TVM logging from performance-sensitive parts of runtime

Posted by GitBox <gi...@apache.org>.
janetsc commented on code in PR #12957:
URL: https://github.com/apache/tvm/pull/12957#discussion_r985329003


##########
src/runtime/hexagon/hexagon_buffer_manager.h:
##########
@@ -39,11 +38,17 @@ class HexagonBufferManager {
    * \param ptr Address of the HexagonBuffer as returned by `AllocateHexagonBuffer`.
    */
   void FreeHexagonBuffer(void* ptr) {
-    auto it = hexagon_buffer_map_.find(ptr);
-    CHECK(it != hexagon_buffer_map_.end())
-        << "Attempt made to free unknown or already freed dataspace allocation";
-    CHECK(it->second != nullptr);
-    {
+    if (auto it = hexagon_buffer_map_.find(ptr); it == hexagon_buffer_map_.end()) {
+      // This should be an assertion, but something seems to go wrong here.
+      // The symptom is that when resources are being released (ReleaseResources),
+      // one buffer disappears from the "runtime" buffer manager, and is not deleted
+      // when that manager is reset. The FreeHexagonBuffer is than called for that
+      // buffer, but with the "static" buffer manager instead. That manager doesn't
+      // find it in the map and throws an exception, which somehow doesn't abort
+      // the program.
+      HEXAGON_PRINT(ERROR, "Attempt made to free unknown or already freed dataspace allocation");
+    } else {
+      HEXAGON_ASSERT(it->second != nullptr);

Review Comment:
   I will investigate the buffer free after ReleaseResources.  This happens on session shutdown when the process is already being torn down.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] kparzysz-quic commented on pull request #12957: [Hexagon] Remove TVM logging from performance-sensitive parts of runtime

Posted by GitBox <gi...@apache.org>.
kparzysz-quic commented on PR #12957:
URL: https://github.com/apache/tvm/pull/12957#issuecomment-1265559044

   The degradation started a while back, but we didn't notice, because the tests were passing without timing out.  The main contributor to the degradation is the exception being thrown, but I only narrowed it down after creating this PR (I used it locally to unblock our tests---that was an urgent thing to do).
   
   I can revert it, and only remove the actual assertion in free-after-release.  That situation shouldn't be happening---there is something weird going on there.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] kparzysz-quic commented on pull request #12957: [Hexagon] Remove TVM logging from performance-sensitive parts of runtime

Posted by GitBox <gi...@apache.org>.
kparzysz-quic commented on PR #12957:
URL: https://github.com/apache/tvm/pull/12957#issuecomment-1265482652

   Our local simulator tests would time out after 1h, now they finish within 30 min.  I used a subset of the fp16 conv tests, and without this change they would take about 45s per test, after the change they take about 15s per test.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] janetsc commented on pull request #12957: [Hexagon] Remove TVM logging from performance-sensitive parts of runtime

Posted by GitBox <gi...@apache.org>.
janetsc commented on PR #12957:
URL: https://github.com/apache/tvm/pull/12957#issuecomment-1265594905

   > The degradation started a while back, but we didn't notice, because the tests were passing without timing out. The main contributor to the degradation is the exception being thrown, but I only narrowed it down after creating this PR (I used it locally to unblock our tests---that was an urgent thing to do).
   > 
   > I can revert it, and only remove the actual assertion in free-after-release. That situation shouldn't be happening---there is something weird going on there.
   
   I tested this locally, which I prefer to removing the assert.  This will catch the error if we are operating on the static buffer manager:
   
   ```
   void HexagonDeviceAPI::FreeDataSpace(Device dev, void* ptr) {
     CHECK(ptr) << "buffer pointer is null";
     CHECK(IsValidDevice(dev)) << "dev.device_type: " << dev.device_type;
     if (mgr == &hexbuffs) {
       // Free can throw if it is called after ReleaseResources
       try {
         mgr->FreeHexagonBuffer(ptr);
       } catch (int e) {
         LOG(INFO) << "FreeDataSpace called for static manager.  An exception occurred " << e;
       }
     } else {
       mgr->FreeHexagonBuffer(ptr);
     }
   }
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] kparzysz-quic commented on pull request #12957: [Hexagon] Remove TVM logging from performance-sensitive parts of runtime

Posted by GitBox <gi...@apache.org>.
kparzysz-quic commented on PR #12957:
URL: https://github.com/apache/tvm/pull/12957#issuecomment-1264033737

   Please don't use logging.h (LOG, CHECK, etc.) in the Hexagon RT, at least in the performance-sensitive parts of it.
   @janetsc @supersat @adstraw 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] kparzysz-quic commented on pull request #12957: [Hexagon] Remove TVM logging from performance-sensitive parts of runtime

Posted by GitBox <gi...@apache.org>.
kparzysz-quic commented on PR #12957:
URL: https://github.com/apache/tvm/pull/12957#issuecomment-1264487126

   A large part of the slowness came from exception handling, specifically from the failing assertion in `FreeHexagonBuffer` about trying to free an unknown buffer.  There is something weird going on there and I have disabled that assertion (left the message).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] janetsc commented on pull request #12957: [Hexagon] Remove TVM logging from performance-sensitive parts of runtime

Posted by GitBox <gi...@apache.org>.
janetsc commented on PR #12957:
URL: https://github.com/apache/tvm/pull/12957#issuecomment-1265527449

   > 
   
   
   
   > Our local simulator tests would time out after 1h, now they finish within 30 min. I used a subset of the fp16 conv tests, and without this change they would take about 45s per test, after the change they take about 15s per test.
   
   Have your simulator runs always taken this long, or did this recently start happening?  I am wondering if I introduced this will resource management, and this points to needing to handle Free better after ReleaseResources has been called.  (And that's the root culprit, not the logging, which should only be processed upon a failed check.)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org