
Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2020/12/05 04:56:56 UTC

[GitHub] [tvm] merrymercy commented on a change in pull request #7038: [ROCm][Auto schedular] Support Auto schedular and NHWC convolution on ROCm

merrymercy commented on a change in pull request #7038:
URL: https://github.com/apache/tvm/pull/7038#discussion_r536513571



##########
File path: src/auto_scheduler/feature.cc
##########
@@ -1296,7 +1296,8 @@ void GetPerStoreFeaturesWorkerFunc(const SearchTask& task, const State& state, i
     }
     auto mod = IRModule(Map<GlobalVar, BaseFunc>({{global_var, f}}));
 
-    if (task->target->kind->device_type == kDLGPU) {
+    auto device_type = task->target->kind->device_type;
+    if (device_type == kDLGPU || device_type == kDLROCM) {

Review comment:
       To align with the search policy, we can try to use the condition from this function:
   https://github.com/apache/tvm/blob/fd5ce645941153972ecee404c90479b2b391df15/src/auto_scheduler/search_policy/utils.h#L55-L62
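
   As a rough illustration of the suggestion, the device-type check could live in one shared predicate that both the feature extraction and the search policy call, instead of each call site listing device types itself. The sketch below is self-contained and hypothetical: the `DeviceType` enum, the `IsGpuLikeDevice` name, and the exact set of device kinds treated as GPU-like are assumptions for illustration, not the contents of the linked function.

```cpp
// Stand-in for the real device-type enum; members and values are illustrative only.
enum class DeviceType { kCPU, kGPU, kOpenCL, kVulkan, kMetal, kROCM };

// Hypothetical shared predicate: a single definition of "GPU-like target".
// Which device kinds belong in this list is an assumption made for this sketch.
inline bool IsGpuLikeDevice(DeviceType device_type) {
  switch (device_type) {
    case DeviceType::kGPU:
    case DeviceType::kOpenCL:
    case DeviceType::kVulkan:
    case DeviceType::kMetal:
    case DeviceType::kROCM:
      return true;
    default:
      return false;
  }
}
```

   With a helper like this, the condition in the diff above would reduce to one call, and adding a new backend would only require touching one place.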

##########
File path: src/auto_scheduler/search_task.cc
##########
@@ -66,6 +69,13 @@ HardwareParams HardwareParamsNode::GetDefaultHardwareParams(const Target& target
 
     device_api->GetAttr(ctx, tvm::runtime::DeviceAttrKind::kMaxRegistersPerBlock, &ret);
     int max_registers_per_block = ret;

Review comment:
       I think this is a bug.
   I will send another PR to rename `max_registers_per_block` to `max_local_memory_per_block` so that it aligns with the `VerifyGPUCode` pass.



