You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2020/07/13 04:11:18 UTC

[GitHub] [incubator-tvm] merrymercy opened a new pull request #6040: [TOPI] Improve schedule for injective on x86

merrymercy opened a new pull request #6040:
URL: https://github.com/apache/incubator-tvm/pull/6040


   The existing schedule does not handle parallel correctly for all shapes
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] tqchen commented on pull request #6040: [TOPI] Improve schedule for injective on x86

Posted by GitBox <gi...@apache.org>.
tqchen commented on pull request #6040:
URL: https://github.com/apache/incubator-tvm/pull/6040#issuecomment-659117283


   cc @merrymercy please followup


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] tqchen commented on pull request #6040: [TOPI] Improve schedule for injective on x86

Posted by GitBox <gi...@apache.org>.
tqchen commented on pull request #6040:
URL: https://github.com/apache/incubator-tvm/pull/6040#issuecomment-660150798


   If c++ version is good we can call into the c++ version from the pyone


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] merrymercy edited a comment on pull request #6040: [TOPI] Improve schedule for injective on x86

Posted by GitBox <gi...@apache.org>.
merrymercy edited a comment on pull request #6040:
URL: https://github.com/apache/incubator-tvm/pull/6040#issuecomment-659141089


   I added a python version. It handles parallel and vectorization generally for most input shapes. (However, in my test, the vectorization part is not useful. I guess llvm can auto vectorize these easy cases)
   I tested in with BERT (seq_len=128, batch=1) on c5.9xlarge. It reduces the latency from 19.0 ms to 18.7 ms.
   
   My questions:
   1. Why do we have both c++ version and python version? Should we just delete the python version?
   2. I see the same name (topi.x86.schedule_injective) appears in both python and c++. Which one overwrites the other one? It is not clear in the code and leads to confusion.
   
   - python
   https://github.com/apache/incubator-tvm/blob/71b48adf64ffe977c0c89a5fba7b6051eab891d7/topi/python/topi/x86/injective.py#L94
   - c++
   https://github.com/apache/incubator-tvm/blob/dff715a54ee0f02f27b4f4efde08c77e86eff2d2/topi/src/schedule.cc#L99


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] merrymercy edited a comment on pull request #6040: [TOPI] Improve schedule for injective on x86

Posted by GitBox <gi...@apache.org>.
merrymercy edited a comment on pull request #6040:
URL: https://github.com/apache/incubator-tvm/pull/6040#issuecomment-662727589


   I close this PR because after more testing I found the current python version in the master works well. This pr is trying to fix the c++ version, which is only used in the old codebase.
   I also noticed a slight unstable performance regression on resnet-50 by using my version. The average inference time drops from 5.9ms to 6.1ms. I believe the performance will vary on different models and different machines.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] merrymercy edited a comment on pull request #6040: [TOPI] Improve schedule for injective on x86

Posted by GitBox <gi...@apache.org>.
merrymercy edited a comment on pull request #6040:
URL: https://github.com/apache/incubator-tvm/pull/6040#issuecomment-660172683


   Currently, both c++ and python version work well. This PR is ready to be merged from my side.  cc @icemelon9 
   I have no idea how to solve the mess of duplicated python and c++ topi functions.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] merrymercy edited a comment on pull request #6040: [TOPI] Improve schedule for injective on x86

Posted by GitBox <gi...@apache.org>.
merrymercy edited a comment on pull request #6040:
URL: https://github.com/apache/incubator-tvm/pull/6040#issuecomment-662727589


   I close this PR because after more testing I found the current python version in the master works well. This pr is trying to fix the c++ version, which is only used in the old codebase.
   I also noticed a slight unstable performance regression on resnet-50 by using my version.
   I cannot figure out the reason and don't want to go deeper.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] merrymercy edited a comment on pull request #6040: [TOPI] Improve schedule for injective on x86

Posted by GitBox <gi...@apache.org>.
merrymercy edited a comment on pull request #6040:
URL: https://github.com/apache/incubator-tvm/pull/6040#issuecomment-662727589


   I close this PR because after more testing I found the current python version in the master works well. This pr is trying to fix the c++ version, which is only used in the old codebase.
   I also noticed a slight unstable performance regression on resnet-50 by using my version. The average inference time drops from 5.9ms to 6.1ms. I believe which version is better depends on different models and different machines.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] merrymercy edited a comment on pull request #6040: [TOPI] Improve schedule for injective on x86

Posted by GitBox <gi...@apache.org>.
merrymercy edited a comment on pull request #6040:
URL: https://github.com/apache/incubator-tvm/pull/6040#issuecomment-662727589


   I close this PR because after more testing I found the current python version in the master works well. This pr is trying to fix the c++ version, which is only used in the old codebase.
   I also noticed a slight unstable performance regression on resnet-50 and I cannot figure out the reason.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] kevinthesun commented on a change in pull request #6040: [TOPI] Improve schedule for injective on x86

Posted by GitBox <gi...@apache.org>.
kevinthesun commented on a change in pull request #6040:
URL: https://github.com/apache/incubator-tvm/pull/6040#discussion_r453802772



##########
File path: topi/include/topi/x86/injective.h
##########
@@ -44,14 +47,29 @@ namespace x86 {
  * \return The updated schedule.
  */
 inline Schedule schedule_injective_from_existing(Schedule sch, const Tensor& out) {
-  auto axis = sch[out]->op.as<ComputeOpNode>()->axis;
-  if (axis.size() == 4) {
-    auto n = axis[0];
-    auto c = axis[1];
-    auto fused = detail::Fuse(sch[out], {n, c});  // for nhwc layout, fuse n and h
-    sch[out].parallel(fused);
+  const auto& axes = sch[out]->op.as<ComputeOpNode>()->axis;

Review comment:
       Is this schedule func used in the common compilation flow? In op strategy we use python version of injective schedule and it should handle better than this func. However, this looks a more general solution. Should we use this method for op strategy? 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] merrymercy edited a comment on pull request #6040: [TOPI] Improve schedule for injective on x86

Posted by GitBox <gi...@apache.org>.
merrymercy edited a comment on pull request #6040:
URL: https://github.com/apache/incubator-tvm/pull/6040#issuecomment-659141089


   I added a python version. It handles parallel and vectorization generally for most input shapes. (However, in my test, the vectorization part is not useful. I guess llvm can auto vectorize these easy cases)
   I tested in with BERT (seq_len=128, batch=1) on c5.9xlarge. It reduces the latency from 19.0 ms to 18.7 ms.
   
   My questions:
   1. Why do we have both c++ version and python version? Should we just delete the python version?
   2. I see the same name (topi.x86.schedule_injective) appears in both python and c++. Which one overwrites the other one? It is not clear in the code.
   
   - python
   https://github.com/apache/incubator-tvm/blob/71b48adf64ffe977c0c89a5fba7b6051eab891d7/topi/python/topi/x86/injective.py#L94
   - c++
   https://github.com/apache/incubator-tvm/blob/dff715a54ee0f02f27b4f4efde08c77e86eff2d2/topi/src/schedule.cc#L99


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] kevinthesun commented on a change in pull request #6040: [TOPI] Improve schedule for injective on x86

Posted by GitBox <gi...@apache.org>.
kevinthesun commented on a change in pull request #6040:
URL: https://github.com/apache/incubator-tvm/pull/6040#discussion_r453802772



##########
File path: topi/include/topi/x86/injective.h
##########
@@ -44,14 +47,29 @@ namespace x86 {
  * \return The updated schedule.
  */
 inline Schedule schedule_injective_from_existing(Schedule sch, const Tensor& out) {
-  auto axis = sch[out]->op.as<ComputeOpNode>()->axis;
-  if (axis.size() == 4) {
-    auto n = axis[0];
-    auto c = axis[1];
-    auto fused = detail::Fuse(sch[out], {n, c});  // for nhwc layout, fuse n and h
-    sch[out].parallel(fused);
+  const auto& axes = sch[out]->op.as<ComputeOpNode>()->axis;

Review comment:
       Is this schedule func used in the common compilation flow? In op strategy we use python version of injective schedule and it should handle parallelism appropriately. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] merrymercy commented on pull request #6040: [TOPI] Improve schedule for injective on x86

Posted by GitBox <gi...@apache.org>.
merrymercy commented on pull request #6040:
URL: https://github.com/apache/incubator-tvm/pull/6040#issuecomment-659141089


   I added a python version. It handles parallel and vectorization generally for most input shapes.
   I tested in with BERT (seq_len=128, batch=1) on c5.9xlarge. It reduces the latency from 19.0ms to 18.70ms.
   
   My questions:
   1. Why do we have both c++ version and python version? Should we just delete the python version?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] kevinthesun commented on pull request #6040: [TOPI] Improve schedule for injective on x86

Posted by GitBox <gi...@apache.org>.
kevinthesun commented on pull request #6040:
URL: https://github.com/apache/incubator-tvm/pull/6040#issuecomment-660437382


   Yes. If we haven't done so, I suggest testing at least resnet50  to make sure no any performance regression occurs.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] icemelon9 commented on pull request #6040: [TOPI] Improve schedule for injective on x86

Posted by GitBox <gi...@apache.org>.
icemelon9 commented on pull request #6040:
URL: https://github.com/apache/incubator-tvm/pull/6040#issuecomment-660430944


   @merrymercy Yes, I agree with @tqchen. If c++ version works well, we could just remove the python implementation and use c++ one instead.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] merrymercy commented on pull request #6040: [TOPI] Improve schedule for injective on x86

Posted by GitBox <gi...@apache.org>.
merrymercy commented on pull request #6040:
URL: https://github.com/apache/incubator-tvm/pull/6040#issuecomment-660172510


   Currently, both c++ and python versions work well. This PR is ready to be merged from my side.
   I have no idea how to solve the mess of duplicated python and c++ topi functions.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] merrymercy edited a comment on pull request #6040: [TOPI] Improve schedule for injective on x86

Posted by GitBox <gi...@apache.org>.
merrymercy edited a comment on pull request #6040:
URL: https://github.com/apache/incubator-tvm/pull/6040#issuecomment-659141089


   I added a python version. It handles parallel and vectorization generally for most input shapes. (However, in my test, the vectorization part is not useful. I guess llvm can handle it)
   I tested in with BERT (seq_len=128, batch=1) on c5.9xlarge. It reduces the latency from 19.0 ms to 18.7 ms.
   
   My questions:
   1. Why do we have both c++ version and python version? Should we just delete the python version?
   2. I see the same name (topi.x86.schedule_injective) appears in both python and c++. Which one overwrite the other one? It is not clear in the code.
   
   - python
   https://github.com/apache/incubator-tvm/blob/71b48adf64ffe977c0c89a5fba7b6051eab891d7/topi/python/topi/x86/injective.py#L94
   - c++
   https://github.com/apache/incubator-tvm/blob/dff715a54ee0f02f27b4f4efde08c77e86eff2d2/topi/src/schedule.cc#L99


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] merrymercy edited a comment on pull request #6040: [TOPI] Improve schedule for injective on x86

Posted by GitBox <gi...@apache.org>.
merrymercy edited a comment on pull request #6040:
URL: https://github.com/apache/incubator-tvm/pull/6040#issuecomment-659141089


   I added a python version. It handles parallel and vectorization generally for most input shapes.
   I tested in with BERT (seq_len=128, batch=1) on c5.9xlarge. It reduces the latency from 19.0 ms to 18.7 ms.
   
   My questions:
   1. Why do we have both c++ version and python version? Should we just delete the python version?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] merrymercy commented on pull request #6040: [TOPI] Improve schedule for injective on x86

Posted by GitBox <gi...@apache.org>.
merrymercy commented on pull request #6040:
URL: https://github.com/apache/incubator-tvm/pull/6040#issuecomment-659138546


   Why do we have both c++ version and python version? Should we just delete the python version?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] merrymercy edited a comment on pull request #6040: [TOPI] Improve schedule for injective on x86

Posted by GitBox <gi...@apache.org>.
merrymercy edited a comment on pull request #6040:
URL: https://github.com/apache/incubator-tvm/pull/6040#issuecomment-662727589






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] merrymercy removed a comment on pull request #6040: [TOPI] Improve schedule for injective on x86

Posted by GitBox <gi...@apache.org>.
merrymercy removed a comment on pull request #6040:
URL: https://github.com/apache/incubator-tvm/pull/6040#issuecomment-659138546


   Why do we have both c++ version and python version? Should we just delete the python version?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] merrymercy closed pull request #6040: [TOPI] Improve schedule for injective on x86

Posted by GitBox <gi...@apache.org>.
merrymercy closed pull request #6040:
URL: https://github.com/apache/incubator-tvm/pull/6040


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] icemelon9 commented on pull request #6040: [TOPI] Improve schedule for injective on x86

Posted by GitBox <gi...@apache.org>.
icemelon9 commented on pull request #6040:
URL: https://github.com/apache/incubator-tvm/pull/6040#issuecomment-657720251


   Could you also update the injective schedule in the python as well?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] merrymercy edited a comment on pull request #6040: [TOPI] Improve schedule for injective on x86

Posted by GitBox <gi...@apache.org>.
merrymercy edited a comment on pull request #6040:
URL: https://github.com/apache/incubator-tvm/pull/6040#issuecomment-659141089


   I added a python version. It handles parallel and vectorization generally for most input shapes.
   I tested in with BERT (seq_len=128, batch=1) on c5.9xlarge. It reduces the latency from 19.0 ms to 18.7 ms.
   
   My questions:
   1. Why do we have both c++ version and python version? Should we just delete the python version?
   2. I see the same name (topi.x86.schedule_injective) appears in both python and c++. Which one overwrite the other one? It is not clear in the code.
   
   - python
   https://github.com/apache/incubator-tvm/blob/71b48adf64ffe977c0c89a5fba7b6051eab891d7/topi/python/topi/x86/injective.py#L94
   - c++
   https://github.com/apache/incubator-tvm/blob/dff715a54ee0f02f27b4f4efde08c77e86eff2d2/topi/src/schedule.cc#L99


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] icemelon9 commented on pull request #6040: [TOPI] Improve schedule for injective on x86

Posted by GitBox <gi...@apache.org>.
icemelon9 commented on pull request #6040:
URL: https://github.com/apache/incubator-tvm/pull/6040#issuecomment-660431091


   Probably also worth checking with @kevinthesun. I remember last time he saw some performance regression due to the injective schedule.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] merrymercy removed a comment on pull request #6040: [TOPI] Improve schedule for injective on x86

Posted by GitBox <gi...@apache.org>.
merrymercy removed a comment on pull request #6040:
URL: https://github.com/apache/incubator-tvm/pull/6040#issuecomment-660172510


   Currently, both c++ and python versions work well. This PR is ready to be merged from my side.
   I have no idea how to solve the mess of duplicated python and c++ topi functions.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] tqchen commented on pull request #6040: [TOPI] Improve schedule for injective on x86

Posted by GitBox <gi...@apache.org>.
tqchen commented on pull request #6040:
URL: https://github.com/apache/incubator-tvm/pull/6040#issuecomment-661483315


   @merrymercy please follow up


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] merrymercy commented on pull request #6040: [TOPI] Improve schedule for injective on x86

Posted by GitBox <gi...@apache.org>.
merrymercy commented on pull request #6040:
URL: https://github.com/apache/incubator-tvm/pull/6040#issuecomment-662727589


   I close this PR because after more test I found the current python version in tvm master works well. This pr is trying to fix the c++ version, which is only used in the old codebase.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] FrozenGene commented on a change in pull request #6040: [TOPI] Improve schedule for injective on x86

Posted by GitBox <gi...@apache.org>.
FrozenGene commented on a change in pull request #6040:
URL: https://github.com/apache/incubator-tvm/pull/6040#discussion_r453436671



##########
File path: topi/include/topi/x86/injective.h
##########
@@ -44,14 +47,29 @@ namespace x86 {
  * \return The updated schedule.
  */
 inline Schedule schedule_injective_from_existing(Schedule sch, const Tensor& out) {
-  auto axis = sch[out]->op.as<ComputeOpNode>()->axis;
-  if (axis.size() == 4) {
-    auto n = axis[0];
-    auto c = axis[1];
-    auto fused = detail::Fuse(sch[out], {n, c});  // for nhwc layout, fuse n and h
-    sch[out].parallel(fused);
+  const auto& axes = sch[out]->op.as<ComputeOpNode>()->axis;
+  std::vector<IterVar> to_fuse;
+  int64_t prod_len = 1;
+
+  for (const auto& axis : axes) {
+    const auto* pint = axis->dom->extent.as<IntImmNode>();
+    if (pint == nullptr) {
+      break;
+    }
+    prod_len *= pint->value;
+    to_fuse.push_back(axis);
+    if (prod_len >= tvm::runtime::threading::MaxConcurrency()) {

Review comment:
       According to build CI information, we should change the `visibility` of `tvm::runtime::threading::MaxConcurrency` according to append `TVM_DLL`




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] merrymercy commented on pull request #6040: [TOPI] Improve schedule for injective on x86

Posted by GitBox <gi...@apache.org>.
merrymercy commented on pull request #6040:
URL: https://github.com/apache/incubator-tvm/pull/6040#issuecomment-657384814


   I pushed some changes to my branch. But why does not GitHub sync the changes to this PR?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] merrymercy commented on pull request #6040: [TOPI] Improve schedule for injective on x86

Posted by GitBox <gi...@apache.org>.
merrymercy commented on pull request #6040:
URL: https://github.com/apache/incubator-tvm/pull/6040#issuecomment-660172683


   Currently, both c++ and python versions work well. This PR is ready to be merged from my side.  cc @icemelon9 
   I have no idea how to solve the mess of duplicated python and c++ topi functions.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] merrymercy edited a comment on pull request #6040: [TOPI] Improve schedule for injective on x86

Posted by GitBox <gi...@apache.org>.
merrymercy edited a comment on pull request #6040:
URL: https://github.com/apache/incubator-tvm/pull/6040#issuecomment-662727589


   I close this PR because after more testing I found the current python version in the master works well. This pr is trying to fix the c++ version, which is only used in the old codebase.
   I also noticed a slight unstable performance regression on resnet-50 by using my version. The average inference time drops from 5.9ms to 6.1ms.
   I cannot figure out the reason and don't want to go deeper.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] merrymercy edited a comment on pull request #6040: [TOPI] Improve schedule for injective on x86

Posted by GitBox <gi...@apache.org>.
merrymercy edited a comment on pull request #6040:
URL: https://github.com/apache/incubator-tvm/pull/6040#issuecomment-659141089


   I added a python version. It handles parallel and vectorization generally for most input shapes. (However, in my test, the vectorization part is not useful. I guess llvm can auto vectorize these easy cases)
   I tested in with BERT (seq_len=128, batch=1) on c5.9xlarge. It reduces the latency from 19.0 ms to 18.7 ms.
   
   My questions:
   1. Why do we have both c++ version and python version? Should we just delete the python version?
   2. I see the same name (topi.x86.schedule_injective) appears in both python and c++. Which one overwrite the other one? It is not clear in the code.
   
   - python
   https://github.com/apache/incubator-tvm/blob/71b48adf64ffe977c0c89a5fba7b6051eab891d7/topi/python/topi/x86/injective.py#L94
   - c++
   https://github.com/apache/incubator-tvm/blob/dff715a54ee0f02f27b4f4efde08c77e86eff2d2/topi/src/schedule.cc#L99


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org