You are viewing a plain text version of this content. The canonical link for it is here.

Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2021/03/20 20:25:11 UTC

[GitHub] [tvm] echuraev opened a new pull request #7714: [METAL] Fix memory leaks in Metal runtime

echuraev opened a new pull request #7714:
URL: https://github.com/apache/tvm/pull/7714

1. In case when we build runtime without ARC, we can have problems with
memory releasing. Due to some of Objective-C methods returns
autoreleased pointers, we should specify `autoreleasepool` blocks to
determine life cycle of these pointers.
2. Added fix for problem with work group size.
Sometimes auto scheduler generates parameters when work group size
is more than possible. And in this case we got assert from Metal
library. Added check for this situation and it helps to avoid
assert.
3. Fixed memory leak problem when fill tensor by random data.
DLManagedTensor increases reference counter in NDArray but nobody
delete this DLManagedTensor in proper way. This is why memory which
was allocated by NDArray was never released.
4. Removed unnecessary retains. It is not necessary to use retain in some
places where they were used, due to we build metal runtime without
ARC.

Thanks for contributing to TVM! Please refer to guideline https://tvm.apache.org/docs/contribute/ for useful information and tips. After the pull request is submitted, please request code reviews from [Reviewers](https://github.com/apache/incubator-tvm/blob/master/CONTRIBUTORS.md#reviewers) by @ them in the pull request thread.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [tvm] echuraev commented on a change in pull request #7714: [METAL] Fix memory leaks in Metal runtime

Posted by GitBox <gi...@apache.org>.

echuraev commented on a change in pull request #7714:
URL: https://github.com/apache/tvm/pull/7714#discussion_r598418659



##########
File path: src/runtime/metal/metal_device_api.mm
##########
@@ -147,102 +155,110 @@ int GetWarpSize(id<MTLDevice> dev) {
 
 void* MetalWorkspace::AllocDataSpace(TVMContext ctx, size_t nbytes, size_t alignment,
                                      DLDataType type_hint) {
-  this->Init();
-  id<MTLDevice> dev = GetDevice(ctx);
-  // GPU memory only
-  MTLResourceOptions storage_mode = MTLResourceStorageModePrivate;
-  /*
-  #if TARGET_OS_IPHONE
-  storage_mode = MTLResourceStorageModeShared;
-  #else
-  storage_mode = MTLResourceStorageModeManaged;
-  #endif
-  */
-  id<MTLBuffer> buf = [dev newBufferWithLength:nbytes options:storage_mode];
-  ICHECK(buf != nil);
-  return (void*)(CFBridgingRetain(buf));
+  @autoreleasepool {
+    this->Init();
+    id<MTLDevice> dev = GetDevice(ctx);
+    // GPU memory only
+    MTLResourceOptions storage_mode = MTLResourceStorageModePrivate;
+    /*
+    #if TARGET_OS_IPHONE
+    storage_mode = MTLResourceStorageModeShared;
+    #else
+    storage_mode = MTLResourceStorageModeManaged;
+    #endif
+    */
+    id<MTLBuffer> buf = [dev newBufferWithLength:nbytes options:storage_mode];
+    ICHECK(buf != nil);
+    return (void*)(buf);

Review comment:
       `buf` won't be auto-released in this function. Quote from Apple documentation:
   > You take ownership of an object if you create it using a method whose name begins with “alloc” or “new” or contains “copy” (for example, alloc, newObject, or mutableCopy), or if you send it a retain message. You are responsible for relinquishing ownership of objects you own using release or autorelease. Any other time you receive an object, you must not release it.
   
   We create `buf` by using method with "new". So, the `newBufferWithLength` method returns the raw pointer on the memory not autoreleased pointer. Thus, we should call `release` when it will be necessary to free this memory.
   Also, I double-checked this by printing pointer address in `AllocDataSpace` and `FreeDataSpace` the addresses were the same.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [tvm] tqchen commented on pull request #7714: [METAL] Fix memory leaks in Metal runtime

Posted by GitBox <gi...@apache.org>.

tqchen commented on pull request #7714:
URL: https://github.com/apache/tvm/pull/7714#issuecomment-804910681


   Thanks @echuraev 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [tvm] tqchen commented on a change in pull request #7714: [METAL] Fix memory leaks in Metal runtime

Posted by GitBox <gi...@apache.org>.

tqchen commented on a change in pull request #7714:
URL: https://github.com/apache/tvm/pull/7714#discussion_r598321001



##########
File path: src/runtime/metal/metal_device_api.mm
##########
@@ -147,102 +155,110 @@ int GetWarpSize(id<MTLDevice> dev) {
 
 void* MetalWorkspace::AllocDataSpace(TVMContext ctx, size_t nbytes, size_t alignment,
                                      DLDataType type_hint) {
-  this->Init();
-  id<MTLDevice> dev = GetDevice(ctx);
-  // GPU memory only
-  MTLResourceOptions storage_mode = MTLResourceStorageModePrivate;
-  /*
-  #if TARGET_OS_IPHONE
-  storage_mode = MTLResourceStorageModeShared;
-  #else
-  storage_mode = MTLResourceStorageModeManaged;
-  #endif
-  */
-  id<MTLBuffer> buf = [dev newBufferWithLength:nbytes options:storage_mode];
-  ICHECK(buf != nil);
-  return (void*)(CFBridgingRetain(buf));
+  @autoreleasepool {
+    this->Init();
+    id<MTLDevice> dev = GetDevice(ctx);
+    // GPU memory only
+    MTLResourceOptions storage_mode = MTLResourceStorageModePrivate;
+    /*
+    #if TARGET_OS_IPHONE
+    storage_mode = MTLResourceStorageModeShared;
+    #else
+    storage_mode = MTLResourceStorageModeManaged;
+    #endif
+    */
+    id<MTLBuffer> buf = [dev newBufferWithLength:nbytes options:storage_mode];
+    ICHECK(buf != nil);
+    return (void*)(buf);

Review comment:
       It would be good to double check what is happening in here. Would buf get auto-released in the end of the function? (Thus the memory get re-allocated in another place?) Should be continue to use CFBridgingRetain so that the memory is retained? Note that the buf itself is not stored anywhere in the code




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [tvm] tqchen merged pull request #7714: [METAL] Fix memory leaks in Metal runtime

Posted by GitBox <gi...@apache.org>.

tqchen merged pull request #7714:
URL: https://github.com/apache/tvm/pull/7714


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [tvm] echuraev commented on a change in pull request #7714: [METAL] Fix memory leaks in Metal runtime

Posted by GitBox <gi...@apache.org>.

echuraev commented on a change in pull request #7714:
URL: https://github.com/apache/tvm/pull/7714#discussion_r598427368



##########
File path: src/runtime/contrib/random/mt_random_engine.cc
##########
@@ -126,8 +126,10 @@ class RandomEngine {
     } else {
       runtime::NDArray local = runtime::NDArray::Empty(
           std::vector<int64_t>{data->shape, data->shape + data->ndim}, data->dtype, {kDLCPU, 0});
-      FillData(&local.ToDLPack()->dl_tensor, size);
-      runtime::NDArray::CopyFromTo(&local.ToDLPack()->dl_tensor, data);
+      DLManagedTensor* dmt = local.ToDLPack();

Review comment:
       Yes, but `local.operator->()` returns pointer to const DLTensor (`const DLTensor*`). So we won't be able to use it in `FillData` method.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [tvm] tqchen commented on a change in pull request #7714: [METAL] Fix memory leaks in Metal runtime

Posted by GitBox <gi...@apache.org>.

tqchen commented on a change in pull request #7714:
URL: https://github.com/apache/tvm/pull/7714#discussion_r598716824



##########
File path: src/runtime/metal/metal_device_api.mm
##########
@@ -147,102 +155,110 @@ int GetWarpSize(id<MTLDevice> dev) {
 
 void* MetalWorkspace::AllocDataSpace(TVMContext ctx, size_t nbytes, size_t alignment,
                                      DLDataType type_hint) {
-  this->Init();
-  id<MTLDevice> dev = GetDevice(ctx);
-  // GPU memory only
-  MTLResourceOptions storage_mode = MTLResourceStorageModePrivate;
-  /*
-  #if TARGET_OS_IPHONE
-  storage_mode = MTLResourceStorageModeShared;
-  #else
-  storage_mode = MTLResourceStorageModeManaged;
-  #endif
-  */
-  id<MTLBuffer> buf = [dev newBufferWithLength:nbytes options:storage_mode];
-  ICHECK(buf != nil);
-  return (void*)(CFBridgingRetain(buf));
+  @autoreleasepool {
+    this->Init();
+    id<MTLDevice> dev = GetDevice(ctx);
+    // GPU memory only
+    MTLResourceOptions storage_mode = MTLResourceStorageModePrivate;
+    /*
+    #if TARGET_OS_IPHONE
+    storage_mode = MTLResourceStorageModeShared;
+    #else
+    storage_mode = MTLResourceStorageModeManaged;
+    #endif
+    */
+    id<MTLBuffer> buf = [dev newBufferWithLength:nbytes options:storage_mode];
+    ICHECK(buf != nil);
+    return (void*)(buf);

Review comment:
       OK, thanks @echuraev for good explaination




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [tvm] tqchen commented on a change in pull request #7714: [METAL] Fix memory leaks in Metal runtime

Posted by GitBox <gi...@apache.org>.

tqchen commented on a change in pull request #7714:
URL: https://github.com/apache/tvm/pull/7714#discussion_r598717260



##########
File path: src/runtime/contrib/random/mt_random_engine.cc
##########
@@ -126,8 +126,10 @@ class RandomEngine {
     } else {
       runtime::NDArray local = runtime::NDArray::Empty(
           std::vector<int64_t>{data->shape, data->shape + data->ndim}, data->dtype, {kDLCPU, 0});
-      FillData(&local.ToDLPack()->dl_tensor, size);
-      runtime::NDArray::CopyFromTo(&local.ToDLPack()->dl_tensor, data);
+      DLManagedTensor* dmt = local.ToDLPack();

Review comment:
       we can use const_cast here for now




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [tvm] echuraev commented on a change in pull request #7714: [METAL] Fix memory leaks in Metal runtime

Posted by GitBox <gi...@apache.org>.

echuraev commented on a change in pull request #7714:
URL: https://github.com/apache/tvm/pull/7714#discussion_r598731541



##########
File path: src/runtime/contrib/random/mt_random_engine.cc
##########
@@ -126,8 +126,10 @@ class RandomEngine {
     } else {
       runtime::NDArray local = runtime::NDArray::Empty(
           std::vector<int64_t>{data->shape, data->shape + data->ndim}, data->dtype, {kDLCPU, 0});
-      FillData(&local.ToDLPack()->dl_tensor, size);
-      runtime::NDArray::CopyFromTo(&local.ToDLPack()->dl_tensor, data);
+      DLManagedTensor* dmt = local.ToDLPack();

Review comment:
       Done. Thank you for your comment




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [tvm] tqchen commented on a change in pull request #7714: [METAL] Fix memory leaks in Metal runtime

Posted by GitBox <gi...@apache.org>.

tqchen commented on a change in pull request #7714:
URL: https://github.com/apache/tvm/pull/7714#discussion_r598321001



##########
File path: src/runtime/metal/metal_device_api.mm
##########
@@ -147,102 +155,110 @@ int GetWarpSize(id<MTLDevice> dev) {
 
 void* MetalWorkspace::AllocDataSpace(TVMContext ctx, size_t nbytes, size_t alignment,
                                      DLDataType type_hint) {
-  this->Init();
-  id<MTLDevice> dev = GetDevice(ctx);
-  // GPU memory only
-  MTLResourceOptions storage_mode = MTLResourceStorageModePrivate;
-  /*
-  #if TARGET_OS_IPHONE
-  storage_mode = MTLResourceStorageModeShared;
-  #else
-  storage_mode = MTLResourceStorageModeManaged;
-  #endif
-  */
-  id<MTLBuffer> buf = [dev newBufferWithLength:nbytes options:storage_mode];
-  ICHECK(buf != nil);
-  return (void*)(CFBridgingRetain(buf));
+  @autoreleasepool {
+    this->Init();
+    id<MTLDevice> dev = GetDevice(ctx);
+    // GPU memory only
+    MTLResourceOptions storage_mode = MTLResourceStorageModePrivate;
+    /*
+    #if TARGET_OS_IPHONE
+    storage_mode = MTLResourceStorageModeShared;
+    #else
+    storage_mode = MTLResourceStorageModeManaged;
+    #endif
+    */
+    id<MTLBuffer> buf = [dev newBufferWithLength:nbytes options:storage_mode];
+    ICHECK(buf != nil);
+    return (void*)(buf);

Review comment:
       It would be good to double check what is happening in here. Would buf get auto-released in the end of the function? (Thus the memory get re-allocated in another place?) Should be continue to use CFBridgingRetain?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [tvm] tqchen commented on a change in pull request #7714: [METAL] Fix memory leaks in Metal runtime

Posted by GitBox <gi...@apache.org>.

tqchen commented on a change in pull request #7714:
URL: https://github.com/apache/tvm/pull/7714#discussion_r598287539



##########
File path: src/runtime/contrib/random/mt_random_engine.cc
##########
@@ -126,8 +126,10 @@ class RandomEngine {
     } else {
       runtime::NDArray local = runtime::NDArray::Empty(
           std::vector<int64_t>{data->shape, data->shape + data->ndim}, data->dtype, {kDLCPU, 0});
-      FillData(&local.ToDLPack()->dl_tensor, size);
-      runtime::NDArray::CopyFromTo(&local.ToDLPack()->dl_tensor, data);
+      DLManagedTensor* dmt = local.ToDLPack();

Review comment:
       DLManagedTensor is not needed, we can directly use local.operator->() to get the DLTensor pointer




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org